Abstract

Priced timed games are two-player zero-sum games played on priced timed automata (whose 
locations and transitions are labeled by weights modeling the costs of spending time in a state 
and executing an action, respectively). The goals of the players are to minimise and maximise 
the cost to reach a target location, respectively. We consider priced timed games with one clock 
and arbitrary (positive and negative) weights and show that, for an important subclass of theirs 
(the so-called simple priced timed games), one can compute, in exponential time, the optimal 
values that the players can achieve, with their associated optimal strategies. As side results, we 
also show that one-clock priced timed games are determined and that we can use our result on 
simple priced timed games to solve the more general class of so-called reset-acyclic priced timed 
games (with arbitrary weights and one-clock). 

1998 ACM Subject Classification D.2.4 Software/Program Verification, F.3.1 Specifying and 
Verifying and Reasoning about Programs 

Keywords and phrases Priced timed games; Real-time systems; Game theory 

1 Introduction 

The importance of models inspired from the field of game theory is nowadays well-established
in theoretical computer science. They make it possible to describe and analyse the possible interactions
of antagonistic agents (or players) as in the controller synthesis problem, for instance. This
problem asks, given a model of the environment of a system, and of the possible actions
of a controller, to compute a controller that constrains the environment to respect a given
specification. Clearly, one cannot, in general, assume that the two players (the environment
and the controller) will collaborate, hence the need to find a controller strategy that enforces
the specification whatever the environment does. This question thus reduces to computing
a so-called winning strategy for the corresponding player in the game model.

In order to describe precisely the features of complex computer systems, several game 
models have been considered in the literature. In this work, we focus on the model of Priced 
Timed Games US] (PTGs for short), which can be regarded as an extension (in several 
directions) of classical finite automata. First, like timed automata [2], PTGs have clocks,
which are real-valued variables whose values evolve with time elapsing, and which can be 
tested and reset along the transitions. Second, the locations are associated with price-rates 
Figure 1. A simple priced timed game (left) and the lower value function of location $\ell_1$ (right).


and transitions are labeled by discrete prices, as in priced timed automata |U E2 O ■ These 
prices allow one to associate a cost with all runs (or plays), which depends on the sequence 
of transitions traversed by the run, and on the time spent in each visited location. Finally, 
a PTG is played by two players, called Min and Max, and each location of the game is owned 
by either of them (we consider a turn-based version of the game). The player who controls 
the current location decides how long to wait, and which transition to take. 

In this setting, the goal of Min is to reach a given set of target locations, following a
play whose cost is as small as possible. Player Max has an antagonistic objective: he tries
to avoid the target locations, and, if not possible, to maximise the accumulated cost up to
the first visit of a target location. To reflect these objectives, we define the upper value $\overline{\mathsf{Val}}$
of the game as a mapping of the configurations of the PTG to the least cost that Min can
guarantee while reaching the target, whatever the choices of Max. Similarly, the lower value
$\underline{\mathsf{Val}}$ returns the greatest cost that Max can ensure (letting the cost be $+\infty$ in case the
target locations are not reached).

An example of PTG is given in Figure 1, where the locations of Min (respectively, Max)
are represented by circles (respectively, rectangles), and the integers next to the locations
are their price-rates, i.e., the cost of spending one time unit in the location. Moreover,
there is only one clock $x$ in the game, which is never reset, and all guards on transitions
are $x \in [0,1]$ (hence this guard is not displayed and transitions are only labeled by their
respective discrete cost): this is an example of simple priced timed game, as we will define
them properly later. It is easy to check that Min can force reaching the target location $\ell_f$ from
all configurations $(\ell, \nu)$ of the game, where $\ell$ is a location and $\nu$ is a real valuation of the clock
in $[0,1]$. Let us comment on the optimal strategies for both players. From a configuration
$(\ell_4, \nu)$, with $\nu \in [0,1]$, Max had better wait until the clock takes value 1, before taking the
transition to $\ell_f$ (he is forced to move, by the rule of the game). Hence, Max's optimal value
is $3(1 - \nu) - 7 = -3\nu - 4$ from all configurations $(\ell_4, \nu)$. Symmetrically, it is easy to check
that Min had better wait as long as possible in $\ell_7$, hence his optimal value is $-16(1 - \nu)$ from all
configurations $(\ell_7, \nu)$. However, optimal value functions are not always that simple, see for
instance the lower value function of $\ell_1$ on the right of Figure 1, which is a piecewise affine
function. To understand why value functions can be piecewise affine, consider the sub-game
enclosed in the dotted rectangle in Figure 1, and consider the value that Min can guarantee
from a configuration of the form $(\ell_3, \nu)$ in this sub-game. Clearly, Min must decide how long
he will spend in $\ell_3$ and whether he will go to $\ell_4$ or $\ell_7$. His optimal value from all $(\ell_3, \nu)$ is thus
$\inf_{0 \le t \le 1-\nu} \min\bigl(4t + (-3(\nu + t) - 4),\ 4t + 6 - 16(1 - (\nu + t))\bigr) = \min(-3\nu - 4,\ 16\nu - 10)$, the infimum being reached for $t = 0$. Since
$16\nu - 10 \ge -3\nu - 4$ if and only if $\nu \ge 6/19$, the best choice of Min is to move instantaneously
to $\ell_7$ if $\nu \in [0, 6/19]$ and to move instantaneously to $\ell_4$ if $\nu \in (6/19, 1]$, hence the value
function of $\ell_3$ (in the subgame) is a piecewise affine function with two pieces.
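
The computation for the sub-game can be replayed numerically. The following Python sketch is only an illustration: the function names are ours, and the price-rates (4, 3 and -16) and transition prices (0, 6 and -7) are read off the formulas in the text rather than from the figure.

```python
from fractions import Fraction

def value_l4(v):
    # Max waits in l4 (price-rate 3) until the clock reaches 1, then pays -7.
    return 3 * (1 - v) - 7

def value_l7(v):
    # Min waits in l7 (price-rate -16) until the clock reaches 1.
    return -16 * (1 - v)

def cost_from_l3(v, t, target):
    # Wait t time units in l3 (price-rate 4), then jump to l4 (price 0)
    # or to l7 (price 6), and continue optimally from there.
    if target == "l4":
        return 4 * t + value_l4(v + t)
    return 4 * t + 6 + value_l7(v + t)

def value_l3(v):
    # Waiting in l3 only increases both costs, so Min leaves at t = 0.
    return min(cost_from_l3(v, 0, "l4"), cost_from_l3(v, 0, "l7"))

cut = Fraction(6, 19)
print(value_l3(cut))             # both branches agree at the cutpoint
print(value_l3(Fraction(1, 2)))  # right of the cutpoint, the l4 branch wins
```

Exact rationals are used so that the two affine pieces can be compared at 6/19 without rounding.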

Related work. PTGs were independently investigated in fS] and [T]. For (non-necessarily
turn-based) PTGs with non-negative prices, semi-algorithms are given to decide the value 
problem that is to say, whether the lower value of a location (the best cost that Min can 
guarantee in valuation 0), is below a given threshold. They also showed that, under the 
strongly non-Zeno assumption on prices (asking the existence of k > 0 such that every 
cycle in the underlying region graph has a cost at least k), the proposed semi-algorithms 
always terminate. This assumption was justified in ElEj by showing that, in the absence 
of the strongly non-Zeno assumption, the existence problem, that is, deciding whether Min has a strategy
guaranteeing to reach a target location with a cost below a given threshold, is indeed
undecidable for PTGs with non-negative prices and three or more clocks. This result was
recently extended in pj] to show that the value problem is also undecidable for PTGs with 
non-negative prices and four or more clocks. In [5], the undecidability of the existence problem
has also been shown for PTGs with arbitrary price-rates (without prices on transitions),
and two or more clocks. On a positive side, the value problem was shown decidable by m 
for PTGs with one clock when the prices are non-negative: a 3-exponential time algorithm 
was first proposed, further refined in mm into an exponential time algorithm. The key 
point of those algorithms is to reduce the problem to the computation of optimal values in 
a restricted family of PTGs called Simple Priced Timed Games (SPTGs for short), where 
the underlying automata contain no guard, no reset, and the play is forced to stop after one 
time unit. More precisely, the PTG is decomposed into a sequence of SPTGs whose value 
functions are computed and re-assembled to yield the value function of the original PTG. 
Alternatively, and with radically different techniques, a pseudo-polynomial time algorithm 
to solve one-clock PTGs with arbitrary prices on transitions, and price-rates restricted to 
two values amongst $\{-d, 0, +d\}$ (with $d \in \mathbb{N}$) was given in [T5].

Contributions. Following the decidability results sketched above, we consider PTGs with 
one clock. We extend those results by considering arbitrary (positive and negative) prices. 
Indeed, all previous works on PTGs with only one clock (except P3]) have considered non-negative
weights only, and the status of the more general case with arbitrary weights has so
far remained elusive. Yet, arbitrary weights are an important modeling feature. Consider, 
for instance, a system which can consume but also produce energy at different rates. In 
this case, energy consumption could be modeled as a positive price-rate, and production by 
a negative price-rate. We propose an exponential time algorithm to compute the value of 
one-clock SPTGs with arbitrary weights. While this result might sound limited due to the 
restricted class of simple PTGs we can handle, we recall that the previous works mentioned 
above Osmans] have demonstrated that solving SPTGs is a key result towards solving more 
general PTGs. Moreover, this algorithm is, as far as we know, the first to handle the full 
class of SPTGs with arbitrary weights, and we note that the solutions (either the algorithms 
or the proofs) known so far do not generalise to this case. Finally, as a side result, this 
algorithm allows us to solve the more general class of reset-acyclic one-clock PTGs that we 
introduce. Thus, although we cannot (yet) solve the whole class of PTGs with arbitrary
weights, our result may be seen as a potentially important milestone towards this goal. 

Some proofs and technical details are in the Appendix. 

2 Priced timed games: syntax, semantics, and preliminary results 

Notations and definitions. Let $x$ denote a positive real-valued variable called clock. A
guard (or clock constraint) is an interval with endpoints in $\mathbb{N} \cup \{+\infty\}$. We often abbreviate




Simple Priced Timed Games Are Not That Simple 


guards, for instance $x \le 5$ instead of $[0,5]$. Let $S \subseteq \mathsf{Guard}(x)$ be a finite set of guards. We
let $[\![S]\!] = \bigcup_{I \in S} I$. Assuming $M_0 = 0 < M_1 < \cdots < M_k$ are all the endpoints of the intervals
in $S$ (to which we add 0), we let $\mathsf{Reg}_S = \{(M_i, M_{i+1}) \mid 0 \le i \le k-1\} \cup \{\{M_i\} \mid 0 \le i \le k\}$
be the set of regions of $S$. Observe that $\mathsf{Reg}_S$ is also a set of guards.
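
The definition of regions can be illustrated by a minimal Python sketch. The encoding is an assumption made for this example only: a guard is given as a pair of finite natural endpoints (whether it is open or closed is irrelevant here), and a region is a tagged triple.

```python
def regions(guards):
    """Regions of a finite set of guards, each given as a pair (lo, hi).

    Returns the open intervals between consecutive endpoints, followed
    by the singleton regions, as tuples ("open", a, b) or ("point", a, a).
    """
    # All endpoints of the guards, to which 0 is added, in increasing order.
    endpoints = sorted({0} | {e for guard in guards for e in guard})
    open_intervals = [("open", a, b) for a, b in zip(endpoints, endpoints[1:])]
    points = [("point", m, m) for m in endpoints]
    return open_intervals + points

print(regions([(0, 2), (1, 3)]))
```

With $k + 1$ distinct endpoints, the sketch produces $k$ open intervals and $k + 1$ singletons, as in the definition.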

We rely on the notion of cost function to formalise the notion of optimal value function
sketched in the introduction. Formally, for a set of guards $S \subseteq \mathsf{Guard}(x)$, a cost function
over $S$ is a function $f\colon [\![\mathsf{Reg}_S]\!] \to \overline{\mathbb{R}} = \mathbb{R} \cup \{+\infty, -\infty\}$ such that over all regions $r \in \mathsf{Reg}_S$,
$f$ is either infinite or a continuous piecewise affine function, with a finite set of cutpoints
(points where the first derivative is not defined) $\{\kappa_1, \ldots, \kappa_p\} \subseteq \mathbb{Q}$, and with $f(\kappa_i) \in \mathbb{Q}$ for
all $1 \le i \le p$. In particular, if $f(r) = \{f(\nu) \mid \nu \in r\}$ contains $+\infty$ (respectively, $-\infty$) for
some region $r$, then $f(r) = \{+\infty\}$ ($f(r) = \{-\infty\}$). We denote by $\mathsf{CF}_S$ the set of all cost
functions over $S$. In our algorithm to solve SPTGs, we will need to combine cost functions
thanks to the $\rhd$ operator. Let $f \in \mathsf{CF}_S$ and $f' \in \mathsf{CF}_{S'}$ be two cost functions on sets of guards
$S, S' \subseteq \mathsf{Guard}(x)$, such that $[\![S]\!] \cap [\![S']\!]$ is a singleton. We let $f \rhd f'$ be the cost function
in $\mathsf{CF}_{S \cup S'}$ such that $(f \rhd f')(\nu) = f(\nu)$ for all $\nu \in [\![\mathsf{Reg}_S]\!]$, and $(f \rhd f')(\nu) = f'(\nu)$ for all
$\nu \in [\![\mathsf{Reg}_{S'}]\!] \setminus [\![\mathsf{Reg}_S]\!]$.

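We consider an extended notion of one-clock priced timed games (PTGs for short) allowing
for the use of urgent locations, where only a zero delay can be spent, and final cost functions
which are associated with each final location and incur an extra cost to be paid when
ending the game in this location. Formally, a PTG $\mathcal{G}$ is a tuple $(L_{\mathsf{Min}}, L_{\mathsf{Max}}, L_f, L_u, \varphi, \Delta, \pi)$
where $L_{\mathsf{Min}}$ (respectively, $L_{\mathsf{Max}}$) is a finite set of locations for player Min (respectively, Max),
with $L_{\mathsf{Min}} \cap L_{\mathsf{Max}} = \emptyset$; $L_f$ is a finite set of final locations, and we let $L = L_{\mathsf{Min}} \cup L_{\mathsf{Max}} \cup L_f$ be
the whole location space; $L_u \subseteq L \setminus L_f$ indicates urgent locations;$^1$ $\Delta \subseteq (L \setminus L_f) \times \mathsf{Guard}(x) \times
\{\top, \bot\} \times L$ is a finite set of transitions; $\varphi = (\varphi_\ell)_{\ell \in L_f}$ associates to each $\ell \in L_f$ its final
cost function, that is an affine$^2$ cost function $\varphi_\ell$ over $S_{\mathcal{G}} = \{I \mid \exists \ell, R, \ell' : (\ell, I, R, \ell') \in \Delta\}$;
$\pi\colon L \cup \Delta \to \mathbb{Z}$ maps an integer price to each location (its price-rate) and to each transition.

Intuitively, a transition $(\ell, I, R, \ell')$ changes the current location from $\ell$ to $\ell'$ if the clock
has value in $I$, and the clock is reset according to the Boolean $R$. We assume that, in all PTGs,
the clock $x$ is bounded, i.e., there is $M \in \mathbb{N}$ such that for all guards $I \in S_{\mathcal{G}}$, $I \subseteq [0, M]$.$^3$
We denote by $\mathsf{Reg}_{\mathcal{G}}$ the set $\mathsf{Reg}_{S_{\mathcal{G}}}$ of regions of $\mathcal{G}$. We further denote$^4$ by $\Pi^{\mathrm{tr}}_{\mathcal{G}}$, $\Pi^{\mathrm{loc}}_{\mathcal{G}}$ and
$\Pi^{\mathrm{fin}}_{\mathcal{G}}$ respectively the values $\max_{\delta \in \Delta} |\pi(\delta)|$, $\max_{\ell \in L} |\pi(\ell)|$ and $\sup_{\nu \in [0,M]} \max_{\ell \in L_f} |\varphi_\ell(\nu)| =
\max_{\ell \in L_f} \max(|\varphi_\ell(0)|, |\varphi_\ell(M)|)$. That is, $\Pi^{\mathrm{tr}}_{\mathcal{G}}$, $\Pi^{\mathrm{loc}}_{\mathcal{G}}$ and $\Pi^{\mathrm{fin}}_{\mathcal{G}}$ are the largest absolute values of
the transition prices, location prices and final cost functions, respectively.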

Let $\mathcal{G} = (L_{\mathsf{Min}}, L_{\mathsf{Max}}, L_f, L_u, \varphi, \Delta, \pi)$ be a PTG. A configuration of $\mathcal{G}$ is a pair $s = (\ell, \nu) \in
L \times \mathbb{R}^+$. We denote by $\mathsf{Conf}_{\mathcal{G}}$ the set of configurations of $\mathcal{G}$. Let $(\ell, \nu)$ and $(\ell', \nu')$ be two
configurations. Let $\delta = (\ell, I, R, \ell') \in \Delta$ be a transition of $\mathcal{G}$ and $t \in \mathbb{R}^+$ be a delay. Then,
there is a $(t, \delta)$-transition from $(\ell, \nu)$ to $(\ell', \nu')$ with cost $c$, denoted by $(\ell, \nu) \xrightarrow{t, \delta, c} (\ell', \nu')$,
if (i) $\ell \in L_u$ implies $t = 0$; (ii) $\nu + t \in I$; (iii) $R = \top$ implies $\nu' = 0$; (iv) $R = \bot$
implies $\nu' = \nu + t$; (v) $c = \pi(\delta) + t \times \pi(\ell)$. Observe that the cost of $(t, \delta)$ takes into
account the price-rate of $\ell$, the delay spent in $\ell$, and the price of $\delta$. We assume that


1 Here we differ from [10] where $L_u \subseteq L_{\mathsf{Max}}$.

2 The affine restriction on final cost functions is to simplify our further arguments, though we do believe
that all of our results could be adapted to cope with general cost functions.

3 Observe that this last restriction is not without loss of generality in the case of PTGs. While all timed
automata $\mathcal{A}$ can be turned into an equivalent (with respect to reachability properties) $\mathcal{A}'$ whose clocks
are bounded [T], this technique cannot be applied to PTGs, in particular with arbitrary prices.

4 Throughout the paper, we often drop the $\mathcal{G}$ in the subscript of several notations when the game is clear
from the context.
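
the game has no deadlock: for all $s \in \mathsf{Conf}_{\mathcal{G}}$, there are $(t, \delta, c)$ and $s' \in \mathsf{Conf}_{\mathcal{G}}$ such that
$s \xrightarrow{t, \delta, c} s'$. Finally, we write $s \xrightarrow{c} s'$ whenever there are $t$ and $\delta$ such that $s \xrightarrow{t, \delta, c} s'$. A
play of $\mathcal{G}$ is a finite or infinite path $\rho = (\ell_0, \nu_0) \xrightarrow{c_0} (\ell_1, \nu_1) \xrightarrow{c_1} (\ell_2, \nu_2) \cdots$. For a finite play
$\rho = (\ell_0, \nu_0) \xrightarrow{c_0} (\ell_1, \nu_1) \xrightarrow{c_1} (\ell_2, \nu_2) \cdots \xrightarrow{c_{n-1}} (\ell_n, \nu_n)$, we let $|\rho| = n$. For an infinite play
$\rho = (\ell_0, \nu_0) \xrightarrow{c_0} (\ell_1, \nu_1) \xrightarrow{c_1} (\ell_2, \nu_2) \cdots$, we let $|\rho|$ be the least position $i$ such that $\ell_i \in L_f$
if such a position exists, and $|\rho| = +\infty$ otherwise. Then, we let $\mathsf{Cost}_{\mathcal{G}}(\rho)$ be the cost of $\rho$,
with $\mathsf{Cost}_{\mathcal{G}}(\rho) = +\infty$ if $|\rho| = +\infty$, and $\mathsf{Cost}_{\mathcal{G}}(\rho) = \sum_{i=0}^{|\rho|-1} c_i + \varphi_{\ell_{|\rho|}}(\nu_{|\rho|})$ otherwise.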
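
The cost of a finite play ending in a final location can be computed step by step. The sketch below is a hedged illustration: the tuple encoding of a step and the name `play_cost` are assumptions of this example, guards and urgency are not checked, and the sample play reuses the price-rates read off the introduction's formulas.

```python
from fractions import Fraction

def play_cost(v0, steps, final_cost):
    """Cost of a finite play that ends in a final location.

    v0: initial clock value.
    steps: list of (price_rate, transition_price, delay, reset) tuples,
           one per transition of the play.
    final_cost: final cost function of the last location.
    """
    v, total = v0, 0
    for rate, price, delay, reset in steps:
        total += price + delay * rate  # c = pi(delta) + t * pi(l)
        v = 0 if reset else v + delay  # clock value after the transition
    return total + final_cost(v)

# From clock value 1/2: leave a location of price-rate 4 immediately
# (price 0), then wait 1/2 in a location of price-rate 3 and pay -7.
steps = [(4, 0, 0, False), (3, -7, Fraction(1, 2), False)]
print(play_cost(Fraction(1, 2), steps, lambda v: 0))
```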

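A strategy for player Min is a function $\sigma_{\mathsf{Min}}$ mapping every finite play ending in a location
of Min to a pair $(t, \delta) \in \mathbb{R}^+ \times \Delta$, indicating what Min should play. We also request
that the strategy proposes only valid pairs $(t, \delta)$, i.e., that for all runs $\rho$ ending in $(\ell, \nu)$,
$\sigma_{\mathsf{Min}}(\rho) = (t, (\ell, I, R, \ell'))$ implies that $\nu + t \in I$. Strategies $\sigma_{\mathsf{Max}}$ of player Max are defined
accordingly. We let $\mathsf{Strat}_{\mathsf{Min}}(\mathcal{G})$ and $\mathsf{Strat}_{\mathsf{Max}}(\mathcal{G})$ be the sets of strategies of Min and Max,
respectively. A pair of strategies $(\sigma_{\mathsf{Min}}, \sigma_{\mathsf{Max}}) \in \mathsf{Strat}_{\mathsf{Min}}(\mathcal{G}) \times \mathsf{Strat}_{\mathsf{Max}}(\mathcal{G})$ is called a profile
of strategies. Together with an initial configuration $s_0 = (\ell_0, \nu_0)$, it defines a unique play
$\mathsf{Play}(s_0, \sigma_{\mathsf{Min}}, \sigma_{\mathsf{Max}}) = s_0 \xrightarrow{c_0} s_1 \xrightarrow{c_1} s_2 \cdots s_k \xrightarrow{c_k} \cdots$ where for all $j \ge 0$, $s_{j+1}$ is the unique
configuration such that $s_j \xrightarrow{t_j, \delta_j, c_j} s_{j+1}$ with $(t_j, \delta_j) = \sigma_{\mathsf{Min}}(s_0 \xrightarrow{c_0} s_1 \cdots s_{j-1} \xrightarrow{c_{j-1}} s_j)$ if $\ell_j \in L_{\mathsf{Min}}$,
and $(t_j, \delta_j) = \sigma_{\mathsf{Max}}(s_0 \xrightarrow{c_0} s_1 \cdots s_{j-1} \xrightarrow{c_{j-1}} s_j)$ if $\ell_j \in L_{\mathsf{Max}}$. We let $\mathsf{Play}(\sigma_{\mathsf{Min}})$
(respectively, $\mathsf{Play}(s_0, \sigma_{\mathsf{Min}})$) be the set of plays that conform with $\sigma_{\mathsf{Min}}$ (and start in $s_0$).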

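As sketched in the introduction, we consider optimal reachability-cost games on PTGs,
where the aim of player Min is to reach a location of $L_f$ while minimising the cost. To formalise
this objective, we let the value of a strategy $\sigma_{\mathsf{Min}}$ for Min be the function $\mathsf{Val}^{\sigma_{\mathsf{Min}}}_{\mathcal{G}}\colon \mathsf{Conf}_{\mathcal{G}} \to
\overline{\mathbb{R}}$ such that for all $s \in \mathsf{Conf}_{\mathcal{G}}$: $\mathsf{Val}^{\sigma_{\mathsf{Min}}}_{\mathcal{G}}(s) = \sup_{\sigma_{\mathsf{Max}} \in \mathsf{Strat}_{\mathsf{Max}}} \mathsf{Cost}(\mathsf{Play}(s, \sigma_{\mathsf{Min}}, \sigma_{\mathsf{Max}}))$. Intuitively,
$\mathsf{Val}^{\sigma_{\mathsf{Min}}}_{\mathcal{G}}(s)$ is the largest value that Max can achieve when playing against strategy
$\sigma_{\mathsf{Min}}$ of Min (it is thus a worst case from the point of view of Min). Symmetrically, for
$\sigma_{\mathsf{Max}} \in \mathsf{Strat}_{\mathsf{Max}}$, $\mathsf{Val}^{\sigma_{\mathsf{Max}}}_{\mathcal{G}}(s) = \inf_{\sigma_{\mathsf{Min}} \in \mathsf{Strat}_{\mathsf{Min}}} \mathsf{Cost}(\mathsf{Play}(s, \sigma_{\mathsf{Min}}, \sigma_{\mathsf{Max}}))$, for all $s \in \mathsf{Conf}_{\mathcal{G}}$.
Then, the upper and lower values of $\mathcal{G}$ are respectively the functions $\overline{\mathsf{Val}}_{\mathcal{G}}\colon \mathsf{Conf}_{\mathcal{G}} \to \overline{\mathbb{R}}$
and $\underline{\mathsf{Val}}_{\mathcal{G}}\colon \mathsf{Conf}_{\mathcal{G}} \to \overline{\mathbb{R}}$ where, for all $s \in \mathsf{Conf}_{\mathcal{G}}$, $\overline{\mathsf{Val}}_{\mathcal{G}}(s) = \inf_{\sigma_{\mathsf{Min}} \in \mathsf{Strat}_{\mathsf{Min}}} \mathsf{Val}^{\sigma_{\mathsf{Min}}}_{\mathcal{G}}(s)$ and
$\underline{\mathsf{Val}}_{\mathcal{G}}(s) = \sup_{\sigma_{\mathsf{Max}} \in \mathsf{Strat}_{\mathsf{Max}}} \mathsf{Val}^{\sigma_{\mathsf{Max}}}_{\mathcal{G}}(s)$. We say that a game is determined if the lower and the
upper values match for every configuration $s$, and in this case, we say that the optimal value
$\mathsf{Val}_{\mathcal{G}}$ of the game $\mathcal{G}$ exists, defined by $\mathsf{Val}_{\mathcal{G}} = \underline{\mathsf{Val}}_{\mathcal{G}} = \overline{\mathsf{Val}}_{\mathcal{G}}$. A strategy $\sigma_{\mathsf{Min}}$ of Min is optimal
(respectively, $\varepsilon$-optimal) if $\mathsf{Val}^{\sigma_{\mathsf{Min}}}_{\mathcal{G}} = \mathsf{Val}_{\mathcal{G}}$ ($\mathsf{Val}^{\sigma_{\mathsf{Min}}}_{\mathcal{G}} \le \mathsf{Val}_{\mathcal{G}} + \varepsilon$), i.e., $\sigma_{\mathsf{Min}}$ ensures that the
cost of the plays will be at most $\mathsf{Val}_{\mathcal{G}}$ ($\mathsf{Val}_{\mathcal{G}} + \varepsilon$). Symmetrically, a strategy $\sigma_{\mathsf{Max}}$ of Max is
optimal (respectively, $\varepsilon$-optimal) if $\mathsf{Val}^{\sigma_{\mathsf{Max}}}_{\mathcal{G}} = \mathsf{Val}_{\mathcal{G}}$ ($\mathsf{Val}^{\sigma_{\mathsf{Max}}}_{\mathcal{G}} \ge \mathsf{Val}_{\mathcal{G}} - \varepsilon$).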


Properties of the value. Let us now prove useful preliminary properties of the value function
of PTGs that, as far as we know, had hitherto never been established. Using a general
determinacy result by Gale and Stewart, we can show that PTGs (with one clock) are
determined. Hence, the value function $\mathsf{Val}_{\mathcal{G}}$ exists for all PTGs $\mathcal{G}$. We can further show that,
for all locations $\ell$, $\mathsf{Val}_{\mathcal{G}}(\ell)$ is a piecewise continuous function that might exhibit discontinuities
only on the borders of the regions of $\mathsf{Reg}_{\mathcal{G}}$ (where $\mathsf{Val}_{\mathcal{G}}(\ell)$ is the function such that
$\mathsf{Val}_{\mathcal{G}}(\ell)(\nu) = \mathsf{Val}_{\mathcal{G}}(\ell, \nu)$ for all $\nu \in \mathbb{R}^+$). See Appendix A for detailed proofs of these results.
The continuity holds only in the case of PTGs with a single clock. An example with two
clocks and a value function exhibiting discontinuities inside a region is in Appendix B.

► Theorem 1. For all (one-clock) PTGs $\mathcal{G}$: (i) $\underline{\mathsf{Val}}_{\mathcal{G}} = \overline{\mathsf{Val}}_{\mathcal{G}}$, i.e., PTGs are determined;
and (ii) for all $r \in \mathsf{Reg}_{\mathcal{G}}$, for all $\ell \in L$, $\mathsf{Val}_{\mathcal{G}}(\ell)$ is either infinite or continuous over $r$.

Simple priced timed games. As sketched in the introduction, our main contribution is to
solve the special case of simple one-clock priced timed games with arbitrary costs. Formally,
an $r$-SPTG, with $r \in \mathbb{Q}^+ \cap [0,1]$, is a PTG $\mathcal{G} = (L_{\mathsf{Min}}, L_{\mathsf{Max}}, L_f, L_u, \varphi, \Delta, \pi)$ such that for
all transitions $(\ell, I, R, \ell') \in \Delta$, $I = [0, r]$ and $R = \bot$. Hence, transitions of $r$-SPTGs are
henceforth denoted by $(\ell, \ell')$, dropping the guard and the reset. Then, an SPTG is a 1-SPTG.
This paper is devoted mainly to proving the following theorem on SPTGs:

► Theorem 2. Let $\mathcal{G}$ be an SPTG. Then, for all locations $\ell \in L$, the function $\mathsf{Val}_{\mathcal{G}}(\ell)$ is either
infinite, or continuous and piecewise-affine with at most an exponential number of cutpoints.
The value functions for all locations, as well as a pair of optimal strategies $(\sigma_{\mathsf{Min}}, \sigma_{\mathsf{Max}})$ (that
always exist if no values are infinite) can be computed in exponential time.

Before sketching the proof of this theorem, we discuss a class of (simple) strategies that
are sufficient to play optimally. Roughly speaking, Max always has a memoryless optimal
strategy, while Min might need (finite) memory to play optimally: it is already the case in
untimed quantitative reachability games with arbitrary weights (see Appendix C). Moreover,
these strategies are finitely representable (recall that even a memoryless strategy depends
on the current configuration and that there are infinitely many in our timed setting).

We formalise Max's strategies with the notion of finite positional strategy (FP-strategy):
they are memoryless strategies $\sigma$ (i.e., for all finite plays $\rho_1 = \rho'_1 \xrightarrow{c_1} s$ and $\rho_2 = \rho'_2 \xrightarrow{c_2} s$
ending in the same configuration, we have $\sigma(\rho_1) = \sigma(\rho_2)$), such that for all locations $\ell$,
there exists a finite sequence of rationals $0 \le \nu^\ell_1 < \nu^\ell_2 < \cdots < \nu^\ell_k = 1$ and a finite sequence
of transitions $\delta_1, \ldots, \delta_k \in \Delta$ such that (i) for all $1 \le i \le k$, for all $\nu \in (\nu^\ell_{i-1}, \nu^\ell_i]$, either
$\sigma(\ell, \nu) = (0, \delta_i)$, or $\sigma(\ell, \nu) = (\nu^\ell_i - \nu, \delta_i)$ (assuming $\nu^\ell_0 = \min(0, \nu^\ell_1)$); and (ii) if $\nu^\ell_1 > 0$,
then $\sigma(\ell, 0) = (\nu^\ell_1, \delta_1)$. We let $\mathsf{pts}(\sigma)$ be the set of $\nu^\ell_i$ for all $\ell$ and $i$, and $\mathsf{int}(\sigma)$ be the
set of all successive intervals generated by $\mathsf{pts}(\sigma)$. Finally, we let $|\sigma| = |\mathsf{int}(\sigma)|$ be the size
of $\sigma$. Intuitively, in an interval $(\nu^\ell_{i-1}, \nu^\ell_i]$, $\sigma$ always returns the same move: either to take
immediately $\delta_i$ or to wait until the clock reaches the endpoint $\nu^\ell_i$ and then take $\delta_i$.

Min, however, may require memory to play optimally. Informally, we will compute optimal
switching strategies, as introduced in [12] (in the untimed setting). A switching strategy
is described by a pair $(\sigma^1_{\mathsf{Min}}, \sigma^2_{\mathsf{Min}})$ of FP-strategies and a switch threshold $K$, and consists
in playing $\sigma^1_{\mathsf{Min}}$ until the total accumulated cost of the discrete transitions is below $K$, and
then switching to strategy $\sigma^2_{\mathsf{Min}}$. The role of $\sigma^2_{\mathsf{Min}}$ is to ensure reaching a final location: it
is thus a (classical) attractor strategy. The role of $\sigma^1_{\mathsf{Min}}$, on the other hand, is to allow
Min to decrease the cost low enough (possibly by forcing negative cycles) to secure a cost
below $K$, and the computation of $\sigma^1_{\mathsf{Min}}$ is thus the critical point in the computation of
an optimal switching strategy. To characterise $\sigma^1_{\mathsf{Min}}$, we introduce the notion of negative
cycle strategy (NC-strategy). Formally, an NC-strategy $\sigma_{\mathsf{Min}}$ of Min is an FP-strategy such
that for all runs $\rho = (\ell_1, \nu) \xrightarrow{c_1} \cdots \xrightarrow{c_{k-1}} (\ell_k, \nu') \in \mathsf{Play}(\sigma_{\mathsf{Min}})$ with $\ell_1 = \ell_k$, and $\nu, \nu'$ in
the same interval of $\mathsf{int}(\sigma_{\mathsf{Min}})$, the sum of prices of discrete transitions is at most $-1$, i.e.,
$\pi(\ell_1, \ell_2) + \cdots + \pi(\ell_{k-1}, \ell_k) \le -1$. To characterise the fact that $\sigma^1_{\mathsf{Min}}$ must allow Min to
reach a cost which is small enough, without necessarily reaching a target state, we define the
fake value of an NC-strategy $\sigma_{\mathsf{Min}}$ from a configuration $s$ as $\mathsf{fake}^{\sigma_{\mathsf{Min}}}_{\mathcal{G}}(s) = \sup\{\mathsf{Cost}(\rho) \mid \rho \in
\mathsf{Play}(s, \sigma_{\mathsf{Min}}),\ \rho \text{ reaches a target}\}$, i.e., the value obtained when ignoring the $\sigma_{\mathsf{Min}}$-induced
plays that do not reach the target. Thus, clearly, $\mathsf{fake}^{\sigma_{\mathsf{Min}}}_{\mathcal{G}}(s) \le \mathsf{Val}^{\sigma_{\mathsf{Min}}}_{\mathcal{G}}(s)$. We say that an
NC-strategy is fake-optimal if its fake value, in every configuration, is equal to the optimal
value of the configuration in the game. This is justified by the following result whose proof
relies on the switching strategies described before (see a detailed proof in Appendix D):

Algorithm 1: solveInstant($\mathcal{G}$, $\nu$)

Input: $r$-SPTG $\mathcal{G} = (L_{\mathsf{Min}}, L_{\mathsf{Max}}, L_f, L_u, \varphi, \Delta, \pi)$, a valuation $\nu \in [0, r]$

foreach $\ell \in L$ do
    if $\ell \in L_f$ then $X(\ell) := \varphi_\ell(\nu)$ else $X(\ell) := +\infty$
repeat
    $X_{pre} := X$
    foreach $\ell \in L_{\mathsf{Max}}$ do $X(\ell) := \max_{(\ell,\ell') \in \Delta} (\pi(\ell, \ell') + X_{pre}(\ell'))$
    foreach $\ell \in L_{\mathsf{Min}}$ do $X(\ell) := \min_{(\ell,\ell') \in \Delta} (\pi(\ell, \ell') + X_{pre}(\ell'))$
    foreach $\ell \in L$ such that $X(\ell) < -(|L| - 1)\Pi^{\mathrm{tr}} - \Pi^{\mathrm{fin}}$ do $X(\ell) := -\infty$
until $X = X_{pre}$
return $X$


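► Lemma 3. If $\mathsf{Val}_{\mathcal{G}}(\ell, \nu) \ne +\infty$, for all $\ell$ and $\nu$, then for all NC-strategies $\sigma_{\mathsf{Min}}$, there is a
strategy $\sigma'_{\mathsf{Min}}$ such that $\mathsf{Val}^{\sigma'_{\mathsf{Min}}}_{\mathcal{G}}(s) \le \mathsf{fake}^{\sigma_{\mathsf{Min}}}_{\mathcal{G}}(s)$ for all configurations $s$. In particular, if $\sigma_{\mathsf{Min}}$
is a fake-optimal NC-strategy, then $\sigma'_{\mathsf{Min}}$ is an optimal (switching) strategy of the SPTG.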

Then, an SPTG is called finitely optimal if (i) Min has a fake-optimal NC-strategy;
(ii) Max has an optimal FP-strategy; and (iii) $\mathsf{Val}_{\mathcal{G}}(\ell)$ is a cost function, for all locations $\ell$.
The central point in establishing Theorem 2 will thus be to prove that all SPTGs are
finitely optimal, as this guarantees the existence of well-behaved optimal strategies and
value functions. We will also show that they can be computed in exponential time. The proof
is by induction on the number of urgent locations of the SPTG. In Section 3, we address the
base case of SPTGs with urgent locations only (where no time can elapse). Since these SPTGs
are very close to the untimed min-cost reachability games of [12], we adapt the algorithm
of that work and obtain the solveInstant function (Algorithm 1). This function can also
compute $\mathsf{Val}_{\mathcal{G}}(\ell, 1)$ for all $\ell$ and all games $\mathcal{G}$ (even with non-urgent locations) since time
cannot elapse anymore when the clock has valuation 1. Next, using the continuity result
of Theorem 1, we can detect locations $\ell$ where $\mathsf{Val}_{\mathcal{G}}(\ell, \nu) \in \{+\infty, -\infty\}$, for all $\nu \in [0,1]$,
and remove them from the game. Finally, in Section 4, we handle SPTGs with non-urgent
locations by refining the technique of [10, 17] (which works only on SPTGs with non-negative
costs). Compared to [10, 17], our algorithm is simpler, being iterative instead of recursive.

3 SPTGs with only urgent locations 

Throughout this section, we consider an $r$-SPTG $\mathcal{G} = (L_{\mathsf{Min}}, L_{\mathsf{Max}}, L_f, L_u, \varphi, \Delta, \pi)$ where all
locations are urgent, i.e., $L_u = L_{\mathsf{Min}} \cup L_{\mathsf{Max}}$. We first explain briefly how we can compute
the value function of the game for a fixed clock valuation $\nu \in [0, r]$ (more precisely, we can
compute the vector $(\mathsf{Val}_{\mathcal{G}}(\ell, \nu))_{\ell \in L}$). Since no time can elapse, we can adapt the techniques
developed in [12] to solve (untimed) min-cost reachability games. The adaptation consists
in taking into account the final cost functions (see Appendix E). This yields the function
solveInstant (Algorithm 1), that computes the vector $(\mathsf{Val}_{\mathcal{G}}(\ell, \nu))_{\ell \in L}$ for a fixed $\nu$. The
results of [12] also allow us to compute associated optimal strategies: when $\mathsf{Val}(\ell, \nu) \notin
\{-\infty, +\infty\}$, the optimal strategy for Max is memoryless, and the optimal strategy for Min is
a switching strategy $(\sigma^1_{\mathsf{Min}}, \sigma^2_{\mathsf{Min}})$ with a threshold $K$ (as described in the previous section).
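
To make the iteration of Algorithm 1 concrete, here is a minimal Python sketch of it for a fixed valuation. It is an illustration under assumptions, not the authors' implementation: the dictionary-based encoding of the game, the function name `solve_instant` and its parameters are ours, the final costs are assumed to be already evaluated at the chosen valuation, and every non-final location is assumed to have at least one outgoing transition (no deadlock).

```python
import math

def solve_instant(l_min, l_max, final_cost, trans, pi_tr, pi_fin):
    """Value iteration for a game where no time elapses.

    l_min, l_max: sets of locations of Min and Max.
    final_cost: dict mapping each final location to its final cost.
    trans: dict mapping pairs (source, target) to integer prices.
    pi_tr, pi_fin: largest absolute transition price and final cost.
    """
    locs = set(l_min) | set(l_max) | set(final_cost)
    succ = {l: [] for l in locs}
    for (src, dst), price in trans.items():
        succ[src].append((dst, price))
    # Final locations start at their final cost, the others at +infinity.
    x = {l: final_cost.get(l, math.inf) for l in locs}
    bound = -(len(locs) - 1) * pi_tr - pi_fin
    while True:
        x_pre = dict(x)
        for l in l_max:
            x[l] = max(price + x_pre[dst] for dst, price in succ[l])
        for l in l_min:
            x[l] = min(price + x_pre[dst] for dst, price in succ[l])
        # Values below the bound can only come from negative cycles.
        for l in locs:
            if x[l] < bound:
                x[l] = -math.inf
        if x == x_pre:
            return x

game = {("a", "f"): 5, ("a", "b"): 1, ("b", "f"): 2}
print(solve_instant({"a"}, {"b"}, {"f": 0}, game, 5, 0))
```

In the small game above, Min owns `a`, Max owns `b`, and `f` is final with cost 0; the iteration stabilises once the cheaper route through `b` has been propagated to `a`.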

Now let us explain how we can reduce the computation of $\mathsf{Val}_{\mathcal{G}}(\ell)\colon \nu \in [0, r] \mapsto \mathsf{Val}(\ell, \nu)$
(for all $\ell$) to a finite number of calls to solveInstant. Let $F_{\mathcal{G}}$ be the set of affine functions
over $[0, r]$ such that $F_{\mathcal{G}} = \{k + \varphi_\ell \mid \ell \in L_f \wedge k \in \mathcal{I}\}$, where $\mathcal{I} = [-(|L| - 1)\Pi^{\mathrm{tr}}, |L|\Pi^{\mathrm{tr}}] \cap \mathbb{Z}$.
Observe that $F_{\mathcal{G}}$ has cardinality at most $2|L|^2\Pi^{\mathrm{tr}}$, i.e., pseudo-polynomial in the size of $\mathcal{G}$. From
[12], we conclude that the functions in $F_{\mathcal{G}}$ are sufficient to characterise $\mathsf{Val}_{\mathcal{G}}$, in the following
sense: for all $\ell \in L$ and $\nu \in [0, r]$ such that $\mathsf{Val}(\ell, \nu) \notin \{-\infty, +\infty\}$, there is $f \in F_{\mathcal{G}}$ with
$\mathsf{Val}(\ell, \nu) = f(\nu)$ (see Lemma 16, Appendix E, for the details). Using the continuity of $\mathsf{Val}_{\mathcal{G}}$
(Theorem 1), we show that all the cutpoints of $\mathsf{Val}_{\mathcal{G}}$ are intersections of functions from $F_{\mathcal{G}}$,
i.e., belong to the set of possible cutpoints $\mathsf{PossCP}_{\mathcal{G}} = \{\nu \in [0, r] \mid \exists f_1, f_2 \in F_{\mathcal{G}},\ f_1 \ne
f_2 \wedge f_1(\nu) = f_2(\nu)\}$. Observe that $\mathsf{PossCP}_{\mathcal{G}}$ contains at most $|F_{\mathcal{G}}|^2 = 4|L|^4(\Pi^{\mathrm{tr}})^2$ points
(also a pseudo-polynomial in the size of $\mathcal{G}$) since all functions in $F_{\mathcal{G}}$ are affine, and can
thus intersect at most once with every other function. Moreover, $\mathsf{PossCP}_{\mathcal{G}} \subseteq \mathbb{Q}$, since all
functions of $F_{\mathcal{G}}$ take rational values in 0 and $r \in \mathbb{Q}$. Thus, for all $\ell$, $\mathsf{Val}_{\mathcal{G}}(\ell)$ is a cost function
(with cutpoints in $\mathsf{PossCP}_{\mathcal{G}}$ and pieces from $F_{\mathcal{G}}$). Since $\mathsf{Val}_{\mathcal{G}}(\ell)$ is a piecewise affine function,
we can characterise it completely by computing only its value on its cutpoints. Hence, we
can reconstruct $\mathsf{Val}_{\mathcal{G}}(\ell)$ by calling solveInstant on each rational valuation $\nu \in \mathsf{PossCP}_{\mathcal{G}}$.
From the optimal strategies computed along solveInstant [12], we can also reconstruct a
fake-optimal NC-strategy for Min and an optimal FP-strategy for Max, hence:

► Proposition 4. Every $r$-SPTG $\mathcal{G}$ with only urgent locations is finitely optimal. Moreover,
for all locations $\ell$, the piecewise affine function $\mathsf{Val}_{\mathcal{G}}(\ell)$ has cutpoints in $\mathsf{PossCP}_{\mathcal{G}}$, of cardinality
at most $4|L|^4(\Pi^{\mathrm{tr}})^2$, pseudo-polynomial in the size of $\mathcal{G}$.
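
The set of possible cutpoints can be enumerated directly from its definition. The following Python sketch is a hedged illustration: the encoding of an affine final cost function as a pair (slope, intercept) and the name `candidate_cutpoints` are assumptions of this example.

```python
from fractions import Fraction
from itertools import combinations

def candidate_cutpoints(final_costs, num_locs, pi_tr, r=Fraction(1)):
    """Intersection points, within [0, r], of the functions k + phi_l.

    final_costs: iterable of (slope, intercept) pairs, one per final location.
    num_locs: number of locations |L|.
    pi_tr: largest absolute transition price.
    """
    # Integer shifts k in [-(|L| - 1) * pi_tr, |L| * pi_tr].
    shifts = range(-(num_locs - 1) * pi_tr, num_locs * pi_tr + 1)
    funcs = {(a, b + k) for (a, b) in final_costs for k in shifts}
    points = set()
    for (a1, b1), (a2, b2) in combinations(funcs, 2):
        if a1 != a2:  # parallel affine functions never cross
            v = Fraction(b2 - b1) / Fraction(a1 - a2)
            if 0 <= v <= r:
                points.add(v)
    return sorted(points)

print(candidate_cutpoints([(0, 0), (1, 0)], num_locs=2, pi_tr=1))
```

Calling a solver for a fixed valuation on each returned point, and interpolating linearly in between, then recovers the piecewise affine value functions described above.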

4 Solving general SPTGs 

In this section, we consider SPTGs with possibly non-urgent locations. We first prove that
all such SPTGs are finitely optimal. Then, we introduce Algorithm 2 to compute optimal
values and strategies of SPTGs. To the best of our knowledge, this is the first algorithm
to solve SPTGs with arbitrary weights. Throughout the section, we fix an SPTG $\mathcal{G} =
(L_{\mathsf{Min}}, L_{\mathsf{Max}}, L_f, L_u, \varphi, \Delta, \pi)$ with possibly non-urgent locations. Before presenting our core
contributions, let us explain how we can detect locations with infinite values. As already
argued, we can compute $\mathsf{Val}(\ell, 1)$ for all $\ell$ assuming all locations are urgent, since time
cannot elapse anymore when the clock has valuation 1. This can be done with solveInstant.
Then, by continuity, $\mathsf{Val}(\ell, 1) = +\infty$ (respectively, $\mathsf{Val}(\ell, 1) = -\infty$), for some $\ell$ if and only if
$\mathsf{Val}(\ell, \nu) = +\infty$ (respectively, $\mathsf{Val}(\ell, \nu) = -\infty$) for all $\nu \in [0,1]$. We remove from the game
all locations with infinite value without changing the values of other locations (as justified
in [12]). Thus, we henceforth assume that $\mathsf{Val}(\ell, \nu) \in \mathbb{R}$ for all $(\ell, \nu)$.

The $\mathcal{G}_{L',r}$ construction. To prove finite optimality of SPTGs and to establish correctness
of our algorithm, we rely in both cases on a construction that consists in decomposing $\mathcal{G}$
into a sequence of SPTGs with more urgent locations. Intuitively, a game with more urgent
locations is easier to solve since it is closer to an untimed game (in particular, when all
locations are urgent, we can apply the techniques of Section 3). More precisely, given a
set $L'$ of non-urgent locations, and a valuation $r_0 \in [0,1]$, we will define a (possibly infinite)
sequence of valuations $1 = r_0 > r_1 > \cdots$ and a sequence $\mathcal{G}_{L',r_0}, \mathcal{G}_{L',r_1}, \ldots$ of SPTGs such
that (i) all locations of $\mathcal{G}$ are also present in each $\mathcal{G}_{L',r_i}$, except that the locations of $L'$
are now urgent; and (ii) for all $i \ge 0$, the value function of $\mathcal{G}_{L',r_i}$ is equal to $\mathsf{Val}_{\mathcal{G}}$ on the
interval $[r_{i+1}, r_i]$. Hence, we can re-construct $\mathsf{Val}_{\mathcal{G}}$ by assembling well-chosen parts of the
value functions of the $\mathcal{G}_{L',r_i}$ (assuming $\inf_i r_i = 0$). This basic result will be exploited in two
directions. First, we prove by induction on the number of urgent locations that all SPTGs are
finitely optimal, by re-constructing $\mathsf{Val}_{\mathcal{G}}$ (as well as optimal strategies) as a $\rhd$-concatenation
of the value functions of a finite sequence of SPTGs with one more urgent location. The
base case, with only urgent locations, is solved by Proposition 4. This construction suggests
a recursive algorithm in the spirit of [10, 17] (for non-negative prices). Second, we show that
this recursion can be avoided (see Algorithm 2). Instead of turning locations urgent one at
a time, this algorithm makes them all urgent and computes directly the sequence of SPTGs
with only urgent locations. Its proof of correctness relies on the finite optimality of SPTGs
and, again, on our basic result linking the value functions of $\mathcal{G}$ and games $\mathcal{G}_{L',r_i}$.

Let us formalise these constructions. Let $\mathcal{G}$ be an SPTG, let $r \in [0,1]$ be an endpoint,
and let $x = (x_\ell)_{\ell \in L}$ be a vector of rational values. Then, $\mathsf{wait}(\mathcal{G}, r, x)$ is an $r$-SPTG in which
both players may now decide, in all non-urgent locations $\ell$, to wait until the clock takes value
$r$, and then to stop the game, adding the cost $x_\ell$ to the current cost of the play. Formally,
$\mathsf{wait}(\mathcal{G}, r, x) = (L_{\mathsf{Min}}, L_{\mathsf{Max}}, L'_f, L_u, \varphi', T', \pi')$ is such that $L'_f = L_f \uplus \{\ell^f \mid \ell \in L \setminus L_u\}$; for
all $\ell' \in L_f$ and $\nu \in [0,r]$, $\varphi'_{\ell'}(\nu) = \varphi_{\ell'}(\nu)$; for all $\ell \in L \setminus L_u$, $\varphi'_{\ell^f}(\nu) = (r - \nu) \cdot \pi(\ell) + x_\ell$;
$T' = T \cup \{(\ell, [0,r], \bot, \ell^f) \mid \ell \in L \setminus L_u\}$; for all $\delta \in T'$, $\pi'(\delta) = \pi(\delta)$ if $\delta \in T$, and $\pi'(\delta) = 0$
otherwise. Then, we let $\mathcal{G}_r = \mathsf{wait}(\mathcal{G}, r, (\mathsf{Val}_{\mathcal{G}}(\ell, r))_{\ell \in L})$, i.e., the game obtained thanks to
$\mathsf{wait}$ by letting $x$ be the value of $\mathcal{G}$ in $r$. One can check that this first transformation does
not alter the value of the game, for valuations before $r$: $\mathsf{Val}_{\mathcal{G}}(\ell, \nu) = \mathsf{Val}_{\mathcal{G}_r}(\ell, \nu)$ for all $\nu \le r$.
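The wait construction can be illustrated by a minimal sketch. The `SPTG` container, its field names and the `^f` suffix for clones are hypothetical choices made for this illustration only; guards are omitted, and "non-urgent locations" is taken here to mean the non-urgent locations of Min and Max.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Callable, Dict, Set, Tuple


@dataclass
class SPTG:
    """Hypothetical container for an r-SPTG (names are illustrative only)."""
    locs_min: Set[str]
    locs_max: Set[str]
    finals: Set[str]
    urgent: Set[str]
    final_cost: Dict[str, Callable[[Fraction], Fraction]]  # phi, one function per final location
    trans_price: Dict[Tuple[str, str], int]                # transitions and their prices
    rate: Dict[str, int]                                   # price-rate pi of each location


def wait(g: SPTG, r: Fraction, x: Dict[str, Fraction]) -> SPTG:
    """Add, for each non-urgent location l, a final clone l^f reached at price 0.

    Stopping in l^f at valuation v costs (r - v) * pi(l) + x[l], which models
    waiting in l until the clock equals r and then paying x[l].
    """
    finals = set(g.finals)
    final_cost = dict(g.final_cost)
    trans_price = dict(g.trans_price)
    for loc in (g.locs_min | g.locs_max) - g.urgent:
        clone = loc + "^f"
        finals.add(clone)
        # Default argument binds the current location inside the closure.
        final_cost[clone] = lambda v, loc=loc: (r - v) * g.rate[loc] + x[loc]
        trans_price[(loc, clone)] = 0
    return SPTG(g.locs_min, g.locs_max, finals, set(g.urgent),
                final_cost, trans_price, g.rate)
```

For instance, with a single non-urgent Min location of price-rate 2 and $x_\ell = 3$, stopping in the clone at valuation 1/2 (with $r = 1$) costs $(1 - 1/2) \cdot 2 + 3 = 4$.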

Next, we make locations urgent. For a set $L' \subseteq L \setminus L_u$ of non-urgent locations, we let
$\mathcal{G}_{L',r}$ be the SPTG obtained from $\mathcal{G}_r$ by making urgent every location $\ell$ of $L'$. Observe that,
although all locations $\ell \in L'$ are now urgent in $\mathcal{G}_{L',r}$, their clones $\ell^f$ allow the players to wait
until $r$. When $L'$ is a singleton $\{\ell\}$, we write $\mathcal{G}_{\ell,r}$ instead of $\mathcal{G}_{\{\ell\},r}$. While the construction
of $\mathcal{G}_r$ does not change the value of the game, introducing urgent locations does. Yet, we can
characterise an interval $[a,r]$ on which the value functions of $\mathcal{H} = \mathcal{G}_{L',r}$ and $\mathcal{H}^+ = \mathcal{G}_{L' \cup \{\ell\},r}$
coincide, as stated by the next proposition. The interval $[a,r]$ depends on the slopes of the
pieces of $\mathsf{Val}_{\mathcal{H}^+}$, as depicted in Figure 2: for each location $\ell$ of Min, the slopes of the pieces
of $\mathsf{Val}_{\mathcal{H}^+}$ contained in $[a,r]$ should be $\ge -\pi(\ell)$ (and $\le -\pi(\ell)$ when $\ell$ belongs to Max). It
is proved by lifting optimal strategies of $\mathcal{H}^+$ into $\mathcal{H}$, and strongly relies on the determinacy
result of Theorem 1.


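► Proposition 5. Let $0 \le a < r \le 1$, $L' \subseteq L \setminus L_u$ and $\ell \notin L' \cup L_u$ a non-urgent location of Min
(respectively, Max). Assume that $\mathcal{G}_{L' \cup \{\ell\},r}$ is finitely optimal, and for all $a \le \nu_1 < \nu_2 \le r$:

$$\frac{\mathsf{Val}_{\mathcal{G}_{L' \cup \{\ell\},r}}(\ell, \nu_2) - \mathsf{Val}_{\mathcal{G}_{L' \cup \{\ell\},r}}(\ell, \nu_1)}{\nu_2 - \nu_1} \ge -\pi(\ell) \quad (\text{respectively, } \le -\pi(\ell)). \qquad (1)$$

Then, for all $\nu \in [a,r]$ and $\ell' \in L$, $\mathsf{Val}_{\mathcal{G}_{L' \cup \{\ell\},r}}(\ell', \nu) = \mathsf{Val}_{\mathcal{G}_{L',r}}(\ell', \nu)$. Furthermore, fake-optimal
NC-strategies and optimal FP-strategies in $\mathcal{G}_{L' \cup \{\ell\},r}$ are also fake-optimal and optimal
over $[a,r]$ in $\mathcal{G}_{L',r}$.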


Figure 2 The condition (1) (in the case $L' = \emptyset$ and $\ell \in L_{\mathsf{Min}}$): graphically, it means
that the slope between any two points of the plot in $[a,r]$ (represented with a thick line) is
greater than or equal to $-\pi(\ell)$ (represented with dashed line).

Figure 3 In this example $L' = \{\ell^\star\}$ and $\ell^\star \in L_{\mathsf{Min}}$. $\mathsf{left}(r)$ is the leftmost point such
that all slopes on its right are greater than or equal to $-\pi(\ell^\star)$ in the graph of $\mathsf{Val}_{\mathcal{G}_{\ell^\star,r}}(\ell^\star, \nu)$.
Dashed lines have slope $-\pi(\ell^\star)$.

Given an SPTG $\mathcal{G}$ and some finitely optimal $\mathcal{G}_{L',r}$, we now characterise precisely the
left endpoint of the maximal interval ending in $r$ where the value functions of $\mathcal{G}$ and $\mathcal{G}_{L',r}$
coincide, with the operator $\mathsf{left}_{L'} \colon (0,1] \to [0,1]$ (or simply $\mathsf{left}$, if $L'$ is clear) defined as:

$$\mathsf{left}_{L'}(r) = \inf\{r' \le r \mid \forall \ell \in L \ \forall \nu \in [r', r] \ \ \mathsf{Val}_{\mathcal{G}_{L',r}}(\ell, \nu) = \mathsf{Val}_{\mathcal{G}}(\ell, \nu)\}.$$

By continuity of the value (Theorem 1), this infimum exists and $\mathsf{Val}_{\mathcal{G}}(\ell, \mathsf{left}_{L'}(r)) =
\mathsf{Val}_{\mathcal{G}_{L',r}}(\ell, \mathsf{left}_{L'}(r))$. Moreover, $\mathsf{Val}_{\mathcal{G}}(\ell)$ is a cost function on $[\mathsf{left}(r), r]$, since $\mathcal{G}_{L',r}$ is finitely
optimal. However, this definition of $\mathsf{left}(r)$ is semantical. Yet, building on the ideas of
Proposition 5, we can effectively compute $\mathsf{left}(r)$, given $\mathsf{Val}_{\mathcal{G}_{L',r}}$. We claim that $\mathsf{left}_{L'}(r)$ is
the minimal valuation such that for all locations $\ell \in L' \cap L_{\mathsf{Min}}$ (respectively, $\ell \in L' \cap L_{\mathsf{Max}}$),
the slopes of the affine sections of the cost function $\mathsf{Val}_{\mathcal{G}_{L',r}}(\ell)$ on $[\mathsf{left}(r), r]$ are at least (at
most) $-\pi(\ell)$ (see Lemma 20 in appendix). Hence, $\mathsf{left}(r)$ can be obtained (see Figure 3)
by inspecting iteratively, for all $\ell$ of Min (respectively, Max), the slopes of $\mathsf{Val}_{\mathcal{G}_{L',r}}(\ell)$, by
decreasing valuations, until we find a piece with a slope $< -\pi(\ell)$ (respectively, $> -\pi(\ell)$).
This enumeration of the slopes is effective as $\mathsf{Val}_{\mathcal{G}_{L',r}}$ has finitely many pieces, by hypothesis.
Moreover, this guarantees that $\mathsf{left}(r) < r$. Thus, one can reconstruct $\mathsf{Val}_{\mathcal{G}}$ on $[\inf_i r_i, r_0]$
from the value functions of the (potentially infinite) sequence of games $\mathcal{G}_{L',r_0}, \mathcal{G}_{L',r_1}, \ldots$
where $r_{i+1} = \mathsf{left}(r_i)$ for all $i$ such that $r_i > 0$, for all possible choices of non-urgent
locations $L'$. Next, we will define two different ways of choosing $L'$: the former to prove finite
optimality of all SPTGs, the latter to obtain an algorithm to solve them.
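The slope inspection described above can be sketched as follows. This is only an illustration under stated assumptions: each value function is given by its list of cutpoints (a hypothetical representation), and the admissibility test is the one of the claim above, i.e., slopes at least $-\pi(\ell)$ for Min and at most $-\pi(\ell)$ for Max. The function names are not from the paper.

```python
from fractions import Fraction
from typing import Callable, Dict, List, Tuple

Point = Tuple[Fraction, Fraction]  # (valuation, value) at a cutpoint


def leftmost_admissible(cutpoints: List[Point],
                        admissible: Callable[[Fraction], bool]) -> Fraction:
    """Scan the affine pieces by decreasing valuations, stop at the first bad slope.

    `cutpoints` is sorted by increasing valuation and ends at r.
    """
    left = cutpoints[-1][0]
    pieces = list(zip(cutpoints[:-1], cutpoints[1:]))
    for (v1, y1), (v2, y2) in reversed(pieces):
        slope = Fraction(y2 - y1) / Fraction(v2 - v1)
        if not admissible(slope):
            break
        left = v1
    return left


def left_of(values: Dict[str, List[Point]], rate: Dict[str, int],
            owner: Dict[str, str]) -> Fraction:
    """Largest of the per-location bounds: every location must pass the test."""
    bounds = []
    for loc, cps in values.items():
        if owner[loc] == "Min":
            ok = lambda s, p=rate[loc]: s >= -p   # assumed test for Min
        else:
            ok = lambda s, p=rate[loc]: s <= -p   # assumed test for Max
        bounds.append(leftmost_admissible(cps, ok))
    return max(bounds)
```

With a Min location of price-rate 1 whose value function has cutpoints $(0,5)$, $(1/2,3)$, $(1,3)$, the piece on $[1/2,1]$ has slope $0 \ge -1$ and is kept, while the piece on $[0,1/2]$ has slope $-4 < -1$, so the scan stops at $1/2$.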

SPTGs are finitely optimal. To prove finite optimality of all SPTGs we reason by induction
on the number of non-urgent locations and instantiate the previous results to the case where
$L' = \{\ell^\star\}$ where $\ell^\star$ is a non-urgent location of minimum price-rate (i.e., for all $\ell \in L$,
$\pi(\ell^\star) \le \pi(\ell)$). Given $r_0 \in [0,1]$, we let $r_0 > r_1 > \cdots$ be the decreasing sequence of
valuations such that $r_i = \mathsf{left}_{\ell^\star}(r_{i-1})$ for all $i > 0$. As explained before, we will build $\mathsf{Val}_{\mathcal{G}}$
on $[\inf_i r_i, r_0]$ from the value functions of the games $\mathcal{G}_{\ell^\star,r_i}$. Assuming finite optimality of those
games, this will prove that $\mathcal{G}$ is finitely optimal under the condition that $r_0 > r_1 > \cdots$
eventually stops, i.e., $r_i = 0$ for some $i$. This property is given by the next lemma, which
ensures that, for all $i$, the owner of $\ell^\star$ has a strictly better strategy in configuration $(\ell^\star, r_{i+1})$
than waiting until $r_i$ in location $\ell^\star$.

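► Lemma 6. If $\mathcal{G}_{\ell^\star,r_i}$ is finitely optimal for all $i \ge 0$, then (i) if $\ell^\star \in L_{\mathsf{Min}}$ (respectively,
$L_{\mathsf{Max}}$), $\mathsf{Val}_{\mathcal{G}}(\ell^\star, r_{i+1}) < \mathsf{Val}_{\mathcal{G}}(\ell^\star, r_i) + (r_i - r_{i+1})\pi(\ell^\star)$ (respectively, $\mathsf{Val}_{\mathcal{G}}(\ell^\star, r_{i+1}) >
\mathsf{Val}_{\mathcal{G}}(\ell^\star, r_i) + (r_i - r_{i+1})\pi(\ell^\star)$), for all $i$; and (ii) there is $i \le |F_{\mathcal{G}}|^2 + 2$ such that $r_i = 0$.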

By iterating this construction, we make all locations urgent iteratively, and obtain: 

► Proposition 7. Every SPTG $\mathcal{G}$ is finitely optimal and, for all locations $\ell$, $\mathsf{Val}_{\mathcal{G}}(\ell)$ has at
most $O\big((\Pi^{\mathrm{tr}} |L|^2)^{2|L|+2}\big)$ cutpoints.

Proof. As announced, we show by induction on $n \ge 0$ that every $r$-SPTG $\mathcal{G}$ with $n$ non-urgent
locations is finitely optimal, and that the number of cutpoints of $\mathsf{Val}_{\mathcal{G}}(\ell)$ is at most
$O\big((\Pi^{\mathrm{tr}}(|L_f| + n^2))^{2n+2}\big)$, which suffices to show the above bound, since $|L_f| + n^2 \le |L|^2$.
The base case $n = 0$ is given by Proposition 4. Now, assume that $\mathcal{G}$ has at least
one non-urgent location, and consider $\ell^\star$ one with minimum price. By induction hypothesis,
all $r'$-SPTGs $\mathcal{G}_{\ell^\star,r'}$ are finitely optimal for all $r' \in [0,r]$. Let $r_0 > r_1 > \cdots$
be the decreasing sequence defined by $r_0 = r$ and $r_i = \mathsf{left}_{\ell^\star}(r_{i-1})$ for all $i \ge 1$. By
Lemma 6, there exists $j \le |F_{\mathcal{G}}|^2 + 2$ such that $r_j = 0$. Moreover, for all $0 < i \le j$,
$\mathsf{Val}_{\mathcal{G}} = \mathsf{Val}_{\mathcal{G}_{\ell^\star,r_{i-1}}}$ on $[r_i, r_{i-1}]$ by definition of $r_i = \mathsf{left}_{\ell^\star}(r_{i-1})$, so that $\mathsf{Val}_{\mathcal{G}}(\ell)$ is a cost
function on this interval, for all $i$, and the number of cutpoints on this interval is bounded
by $O\big((\Pi^{\mathrm{tr}}(|L_f| + (n-1)^2 + n))^{2(n-1)+2}\big) = O\big((\Pi^{\mathrm{tr}}(|L_f| + n^2))^{2(n-1)+2}\big)$ by induction
hypothesis (notice that maximal transition prices are the same in $\mathcal{G}$ and $\mathcal{G}_{\ell^\star,r_{i-1}}$, but that we
add $n$ more final locations in $\mathcal{G}_{\ell^\star,r_{i-1}}$). Adding the cutpoint 1, summing over $i$ from 0 to
$j \le |F_{\mathcal{G}}|^2 + 2$, and observing that $|F_{\mathcal{G}}| \le 2\Pi^{\mathrm{tr}}|L_f|$, we bound the number of cutpoints of
$\mathsf{Val}_{\mathcal{G}}(\ell)$ by $O\big((\Pi^{\mathrm{tr}}(|L_f| + n^2))^{2n+2}\big)$. Finally, we can reconstruct fake-optimal and optimal
strategies in $\mathcal{G}$ from the fake-optimal and optimal strategies of $\mathcal{G}_{\ell^\star,r_i}$. ◄

Computing the value functions. The finite optimality of SPTGs allows us to compute the
value functions. The proof of Proposition 7 suggests a recursive algorithm to do so: from an
SPTG $\mathcal{G}$ with minimal non-urgent location $\ell^\star$, solve recursively $\mathcal{G}_{\ell^\star,1}$, $\mathcal{G}_{\ell^\star,\mathsf{left}(1)}$, $\mathcal{G}_{\ell^\star,\mathsf{left}(\mathsf{left}(1))}$,
etc., handling the base case where all locations are urgent with Algorithm 1. While our results
above show that this is correct and terminates, we propose instead to solve, without the
need for recursion, the sequence of games $\mathcal{G}_{L \setminus L_u,1}$, $\mathcal{G}_{L \setminus L_u,\mathsf{left}(1)}$, $\ldots$, i.e., making all locations
urgent at once. Again, the arguments given above prove that this scheme is correct, but
the key argument of Lemma 6 that ensures termination cannot be applied in this case.
Instead, we rely on the following lemma, stating that there will be at least one cutpoint
of $\mathsf{Val}_{\mathcal{G}}$ in each interval $[\mathsf{left}(r), r]$. Observe that this lemma relies on the fact that $\mathcal{G}$ is
finitely optimal, hence the need to first prove this fact independently with the sequence
$\mathcal{G}_{\ell^\star,1}$, $\mathcal{G}_{\ell^\star,\mathsf{left}(1)}$, $\mathcal{G}_{\ell^\star,\mathsf{left}(\mathsf{left}(1))}$, $\ldots$ Termination then follows from the fact that $\mathsf{Val}_{\mathcal{G}}$ has finitely
many cutpoints by finite optimality.

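► Lemma 8. Let $r_0 \in (0,1]$ such that $\mathcal{G}_{L',r_0}$ is finitely optimal. Suppose that $r_1 = \mathsf{left}_{L'}(r_0) >
0$, and let $r_2 = \mathsf{left}_{L'}(r_1)$. There exists $r' \in [r_2, r_1)$ and $\ell \in L'$ such that (i) $\mathsf{Val}_{\mathcal{G}}(\ell)$ is
affine on $[r', r_1]$, of slope equal to $-\pi(\ell)$, and (ii) $\mathsf{Val}_{\mathcal{G}}(\ell, r_1) \neq \mathsf{Val}_{\mathcal{G}}(\ell, r_0) + \pi(\ell)(r_0 - r_1)$.

As a consequence, $\mathsf{Val}_{\mathcal{G}}(\ell)$ has a cutpoint in $[r_1, r_0)$.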

Algorithm 2 implements these ideas. Each iteration of the while loop computes a
new game in the sequence $\mathcal{G}_{L \setminus L_u,1}$, $\mathcal{G}_{L \setminus L_u,\mathsf{left}(1)}$, $\ldots$ described above; solves it thanks to
solveInstant; and thus computes a new portion of $\mathsf{Val}_{\mathcal{G}}$ on an interval on the left of the
current point $r \in [0,1]$. More precisely, the vector $(\mathsf{Val}_{\mathcal{G}}(\ell, 1))_{\ell \in L}$ is first computed in line 1.
Then, the algorithm enters the while loop, and the game $\mathcal{G}'$ obtained when reaching line 6
is $\mathcal{G}_{L \setminus L_u,1}$. Then, the algorithm enters the repeat loop to analyse this game. Instead of
building the whole value function of $\mathcal{G}'$, Algorithm 2 builds only the parts of $\mathsf{Val}_{\mathcal{G}'}$ that
coincide with $\mathsf{Val}_{\mathcal{G}}$. It proceeds by enumerating the possible cutpoints $a$ of $\mathsf{Val}_{\mathcal{G}'}$, starting in $r$,
by decreasing valuations (line 8), and computes the value of $\mathsf{Val}_{\mathcal{G}'}$ in each cutpoint thanks
to solveInstant (line 9), which yields a new piece of $\mathsf{Val}_{\mathcal{G}'}$. Then, the if in line 10 checks
whether this new piece coincides with $\mathsf{Val}_{\mathcal{G}}$, using the condition given by Proposition 5. If it
is the case, the piece of $\mathsf{Val}_{\mathcal{G}'}$ is added to $f_\ell$ (line 11); repeat is stopped otherwise. When
exiting the repeat loop, variable $b$ has value $\mathsf{left}(1)$. Hence, at the next iteration of the while
loop, $\mathcal{G}' = \mathcal{G}_{L \setminus L_u,\mathsf{left}(1)}$ when reaching line 6. By continuing this reasoning inductively, one
concludes that the successive iterations of the while loop compute the sequence $\mathcal{G}_{L \setminus L_u,1}$,
$\mathcal{G}_{L \setminus L_u,\mathsf{left}(1)}$, $\ldots$ as announced, and rebuild $\mathsf{Val}_{\mathcal{G}}$ from them. Termination in exponential
time is ensured by Lemma 8: each iteration of the while loop discovers at least one new
cutpoint of $\mathsf{Val}_{\mathcal{G}}$, and there are at most exponentially many (note that a tighter bound on
this number of cutpoints would entail a better complexity of our algorithm).

Algorithm 2: solve($\mathcal{G}$)

    Input: SPTG $\mathcal{G} = (L_{\mathsf{Min}}, L_{\mathsf{Max}}, L_f, L_u, \varphi, \Delta, \pi)$
     1  $f = (f_\ell)_{\ell \in L}$ := solveInstant($\mathcal{G}$, 1)                /* $f_\ell \colon \{1\} \to \mathbb{R}$ */
     2  $r$ := 1
     3  while $0 < r$ do                                      /* Invariant: $f_\ell \colon [r,1] \to \mathbb{R}$ */
     4      $\mathcal{G}'$ := wait($\mathcal{G}$, $r$, $f(r)$)                     /* $r$-SPTG $\mathcal{G}' = (L_{\mathsf{Min}}, L_{\mathsf{Max}}, L'_f, L'_u, \varphi', T', \pi')$ */
     5      $L'_u$ := $L'_u \cup L$                                /* every location is made urgent */
     6      $b$ := $r$
     7      repeat                                            /* Invariant: $f_\ell \colon [b,1] \to \mathbb{R}$ */
     8          $a$ := $\max(\mathsf{PossCP}_{\mathcal{G}'} \cap [0, b))$
     9          $x = (x_\ell)_{\ell \in L}$ := solveInstant($\mathcal{G}'$, $a$)      /* $x_\ell = \mathsf{Val}_{\mathcal{G}'}(\ell, a)$ */
    10          if $\forall \ell \in L_{\mathsf{Min}}\ \frac{f_\ell(b) - x_\ell}{b - a} \ge -\pi(\ell)$ $\wedge$ $\forall \ell \in L_{\mathsf{Max}}\ \frac{f_\ell(b) - x_\ell}{b - a} \le -\pi(\ell)$ then
    11              foreach $\ell \in L$ do $f_\ell$ := $\big(\nu \in [a,b] \mapsto f_\ell(b) + (\nu - b)\frac{f_\ell(b) - x_\ell}{b - a}\big) \triangleright f_\ell$
    12              $b$ := $a$; stop := false
    13          else stop := true
    14      until $b = 0$ or stop
    15      $r$ := $b$
    16  return $f$


► Example 9. Let us briefly sketch the execution of Algorithm 2 on the SPTG in Figure 1.
During the first iteration of the while loop, the algorithm computes the correct value
functions until the cutpoint in the repeat loop: at first $a = 9/10$, but the slope in $\ell_1$ is smaller
than the slope that would be granted by waiting, as depicted in Figure 1. Then, $a = 3/4$,
where the algorithm gives a slope of value $-16$ in $\ell_2$ while the cost of this location of Max
is $-14$. During the first iteration of the while loop, the inner repeat loop thus ends with
r = 3/4. The next iterations of the while loop end with r = \ (because does not pass the 
test in line 10 1; r = | (because of If) and finally with r = 0 , giving us the value functions 


on the entire interval [0,1]. All value functions are in Figure 12 in the appendix. 


5 Beyond SPTGs 

In Humana, general PTGs with non-negative prices are solved by reducing them to a finite 
sequence of SPTGs, by eliminating guards and resets. It is thus natural to try and adapt 
these techniques to our general case, in which case Algorithm 2 would allow us to solve
general PTGs with arbitrary costs. Let us explain why it is not (completely) the case. The
technique used to remove guards from PTGs consists in enhancing the locations with regions
while keeping an equivalent game. This technique can be adapted to arbitrary weights, see
Appendix H for a proof adapted from [15, Lemma 4.6].

Figure 4 A PTG where the number of resets in optimal plays cannot be bounded a priori.

The technique to handle resets, however, consists in bounding the number of clock resets
that can occur in any play following an optimal strategy of Min or Max. Then, the PTG can
be unfolded into a reset-acyclic PTG with the same value. By reset-acyclic, we mean that
no cycle in the configuration graph visits a transition with a reset. This reset-acyclic PTG
can be decomposed into a finite number of components that contain no reset and are linked
by transitions with resets. These components can be solved iteratively, from the bottom to
the top, turning them into SPTGs. Thus, if we assume that the PTGs we are given as input
are reset-acyclic, we can solve them in exponential time, and show that their value functions
are cost functions with at most exponentially many cutpoints, using our techniques (see
Appendix H). Unfortunately, the arguments to bound the number of resets do not hold for
arbitrary costs, as shown by the PTG in Figure 4. We claim that $\mathsf{Val}(\ell_0) = 0$; that Min has
no optimal strategy, but a family of $\varepsilon$-optimal strategies $\sigma^\varepsilon_{\mathsf{Min}}$, each with value $\varepsilon$; and that
each $\sigma^\varepsilon_{\mathsf{Min}}$ requires memory whose size depends on $\varepsilon$ and might yield a play visiting at least
$1/\varepsilon$ times the reset between $\ell_0$ and $\ell_1$ (hence the number of resets cannot be bounded). For
all $\varepsilon > 0$, $\sigma^\varepsilon_{\mathsf{Min}}$ consists in: waiting $1 - \varepsilon$ time units in $\ell_0$, then going to $\ell_1$, during the $\lceil 1/\varepsilon \rceil$
first visits to $\ell_0$; and going directly to $\ell_f$ afterwards. Against $\sigma^\varepsilon_{\mathsf{Min}}$, Max has two possible
choices: (i) either wait 0 time units in $\ell_1$, wait $\varepsilon$ time units in $\ell_2$, then reach $\ell_f$; or (ii) wait
$\varepsilon$ time units in $\ell_1$, then force the cycle by going back to $\ell_0$ and wait for Min's next move.
Thus, all plays according to $\sigma^\varepsilon_{\mathsf{Min}}$ will visit a sequence of locations which is either of the form
$\ell_0 (\ell_1 \ell_0)^k \ell_1 \ell_2 \ell_f$, with $0 \le k < \lceil 1/\varepsilon \rceil$; or of the form $\ell_0 (\ell_1 \ell_0)^{\lceil 1/\varepsilon \rceil} \ell_f$. In the former case, the
cost of the play will be $-k\varepsilon + 0 + \varepsilon = -(k-1)\varepsilon \le \varepsilon$; in the latter, $-\varepsilon \lceil 1/\varepsilon \rceil + 1 \le 0$. This
shows that $\mathsf{Val}(\ell_0) = 0$, but there is no optimal strategy as none of these strategies allow
one to guarantee a cost of 0 (neither does the strategy that waits 1 time unit in $\ell_0$).
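The two cost expressions of this example can be checked numerically. The sketch below only evaluates the formulas quoted in the text, $-k\varepsilon + 0 + \varepsilon$ and $-\varepsilon\lceil 1/\varepsilon\rceil + 1$; it does not model the PTG of Figure 4 itself, and the function name is hypothetical.

```python
from fractions import Fraction
from math import ceil


def play_costs(eps: Fraction):
    """Costs of the plays consistent with the eps-optimal strategy of Min.

    Returns the costs of the plays where Max leaves through l2 after k cycles
    (for every admissible k), and the cost of the play where Max always cycles.
    """
    n = ceil(1 / eps)                                    # visits of l0 where Min goes to l1
    exit_costs = [-k * eps + 0 + eps for k in range(n)]  # 0 <= k < ceil(1/eps)
    cycle_cost = -eps * n + 1                            # Max cycles until Min goes to lf
    return exit_costs, cycle_cost


exit_costs, cycle_cost = play_costs(Fraction(3, 10))
# The worst case for Min is k = 0, with cost exactly eps.
assert max(exit_costs) == Fraction(3, 10)
assert cycle_cost <= 0
```

Taking $\varepsilon = 3/10$ gives $\lceil 1/\varepsilon \rceil = 4$: Max obtains at most $\varepsilon$ by leaving immediately, and a non-positive cost by cycling, which matches the claim that $\sigma^\varepsilon_{\mathsf{Min}}$ has value $\varepsilon$.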

However, we may apply the result on reset-acyclic PTGs to obtain: 

► Theorem 10. The value functions of all one-clock PTGs are cost functions with at most
exponentially many cutpoints.

Proof. Let $\mathcal{G}$ be a one-clock PTG. Let us replace all transitions $(\ell, g, \top, \ell')$ resetting the
clock by $(\ell, g, \bot, \ell'')$, where $\ell''$ is a new final location with $\varphi_{\ell''} = \mathsf{Val}_{\mathcal{G}}(\ell', 0)$ (observe that
$\mathsf{Val}_{\mathcal{G}}(\ell', 0)$ exists even if we cannot compute it, so this transformation is well-defined). This
yields a reset-acyclic PTG $\mathcal{G}'$ such that $\mathsf{Val}_{\mathcal{G}'} = \mathsf{Val}_{\mathcal{G}}$. ◄

A Existence and continuity of the value functions: proof of Theorem 1

We start with the proof of determinacy. For all $k \in \mathbb{R}$, define $\mathsf{Threshold}(\mathcal{G}, k)$ as the qualitative game which is
played like $\mathcal{G}$, and where only the objective of Min is altered (in order to make it qualitative): now Min wins a play if
and only if the cost of the play is $\le k$. Further, let $P(k)$ be the set of prefixes of runs ending in a final vertex and
whose cost is less than or equal to $k$. Then the set of winning plays for Min in this game is $S = \bigcup_{\rho \in P(k)} \mathsf{Cone}(\rho)$,
where $\mathsf{Cone}(\rho)$ denotes the set of plays having $\rho$ as a prefix. The set $S$ is an open set in the topology induced
by cones. In El, it is shown that any game whose set of winning plays is an open set is determined, i.e., one
of the two players has a winning strategy. Therefore $\mathsf{Threshold}(\mathcal{G}, k)$ is determined for all $k$.

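Now let us prove that $\underline{\mathsf{Val}}_{\mathcal{G}} = \overline{\mathsf{Val}}_{\mathcal{G}}$. First, recall that, by definition of $\underline{\mathsf{Val}}_{\mathcal{G}}$ and $\overline{\mathsf{Val}}_{\mathcal{G}}$:

$$\underline{\mathsf{Val}}_{\mathcal{G}}(c) \le \overline{\mathsf{Val}}_{\mathcal{G}}(c) \qquad (2)$$

for all configurations $c$. Fix a configuration $c$. We consider several cases:

1. First assume that $\underline{\mathsf{Val}}_{\mathcal{G}}(c) \in \mathbb{R}$. By definition, for all $k > \underline{\mathsf{Val}}_{\mathcal{G}}(c)$ and all strategies $\sigma_{\mathsf{Max}}$, $\mathsf{Val}^{\sigma_{\mathsf{Max}}}_{\mathcal{G}}(c) < k$.
Hence, for all $k > \underline{\mathsf{Val}}_{\mathcal{G}}(c)$, Max has no winning strategy in the game $\mathsf{Threshold}(\mathcal{G}, k)$. Therefore, by
determinacy of this game, Min has a winning strategy. Equivalently, for all $k > \underline{\mathsf{Val}}_{\mathcal{G}}(c)$, there exists $\sigma^k_{\mathsf{Min}}$
such that $\mathsf{Val}^{\sigma^k_{\mathsf{Min}}}_{\mathcal{G}}(c) \le k$. This implies that:

$$\overline{\mathsf{Val}}_{\mathcal{G}}(c) \le \underline{\mathsf{Val}}_{\mathcal{G}}(c) \qquad (3)$$

Hence, by (2) and (3) we conclude that $\overline{\mathsf{Val}}_{\mathcal{G}}(c) = \underline{\mathsf{Val}}_{\mathcal{G}}(c)$ when these values are finite.

2. In the case where $\underline{\mathsf{Val}}_{\mathcal{G}}(c) = +\infty$, we conclude, by (2), that $\overline{\mathsf{Val}}_{\mathcal{G}}(c) = +\infty$ too.

3. Finally, in the case where $\underline{\mathsf{Val}}_{\mathcal{G}}(c) = -\infty$, then for all $k$, Max has no winning strategy for $\mathsf{Threshold}(\mathcal{G}, k)$.
Therefore, by determinacy, Min has a winning strategy $\sigma^k_{\mathsf{Min}}$ in $\mathsf{Threshold}(\mathcal{G}, k)$. Thus, for all $k$: $\overline{\mathsf{Val}}_{\mathcal{G}}(c) \le
\mathsf{Val}^{\sigma^k_{\mathsf{Min}}}_{\mathcal{G}}(c) \le k$, and $\overline{\mathsf{Val}}_{\mathcal{G}}(c) = -\infty$.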


We then turn to the proof of continuity. Our goal is to show that for every location $\ell$, region
$r \in \mathsf{Reg}_{\mathcal{G}}$ and valuations $\nu$ and $\nu'$ in $r$,

$$|\mathsf{Val}(\ell, \nu) - \mathsf{Val}(\ell, \nu')| \le \Pi^{\mathrm{loc}} |\nu - \nu'|.$$

This is equivalent to showing

$$\mathsf{Val}(\ell, \nu) \le \mathsf{Val}(\ell, \nu') + \Pi^{\mathrm{loc}} |\nu - \nu'| \quad \text{and} \quad \mathsf{Val}(\ell, \nu') \le \mathsf{Val}(\ell, \nu) + \Pi^{\mathrm{loc}} |\nu - \nu'|.$$

As those two inequalities are symmetric with respect to $\nu$ and $\nu'$, we only have to show either of them. We will
thus focus on the latter, which, by using the upper value, can be reformulated as: for all strategies $\sigma_{\mathsf{Min}}$ of Min,
there exists a strategy $\sigma'_{\mathsf{Min}}$ such that $\mathsf{Val}^{\sigma'_{\mathsf{Min}}}(\ell, \nu') \le \mathsf{Val}^{\sigma_{\mathsf{Min}}}(\ell, \nu) + \Pi^{\mathrm{loc}} |\nu - \nu'|$. Note that this last inequality
is equivalent to saying that there exists a function $g$ mapping plays $\rho'$ from $(\ell, \nu')$ consistent with $\sigma'_{\mathsf{Min}}$ (i.e., such
that $\rho' = \mathsf{Play}((\ell, \nu'), \sigma'_{\mathsf{Min}}, \sigma_{\mathsf{Max}})$ for some strategy $\sigma_{\mathsf{Max}}$ of Max) to plays from $(\ell, \nu)$, consistent with $\sigma_{\mathsf{Min}}$, such
that

$$\mathsf{Cost}(\rho') \le \mathsf{Cost}(g(\rho')) + \Pi^{\mathrm{loc}} |\nu - \nu'|.$$

Let $r \in \mathsf{Reg}_{\mathcal{G}}$, $\nu, \nu' \in r$ and $\sigma_{\mathsf{Min}}$ be a strategy of Min. We define $\sigma'_{\mathsf{Min}}$ and $g$ by induction on the size of their
arguments; more precisely, we define $\sigma'_{\mathsf{Min}}(\rho'_1)$ and $g(\rho'_2)$ by induction on $k$, for all plays $\rho'_1$ and $\rho'_2$ from $(\ell, \nu')$
consistent with $\sigma'_{\mathsf{Min}}$ of size $k - 1$ and $k$, respectively. We also show during this induction that for each play
$\rho' = (\ell_1, \nu'_1) \to \cdots \to (\ell_k, \nu'_k)$ from $(\ell, \nu')$, consistent with $\sigma'_{\mathsf{Min}}$, if we let $(\ell_1, \nu_1) \to \cdots \to (\ell_k, \nu_k) = g(\rho')$:

(i) $\rho'$ and $g(\rho')$ have the same length, i.e., $|g(\rho')| = k = |\rho'|$,

(ii) for every $i \in \{1, \ldots, k\}$, $\nu_i$ and $\nu'_i$ are in the same region, i.e., there exists a region $r' \in \mathsf{Reg}_{\mathcal{G}}$ such that
$\nu_i \in r'$ and $\nu'_i \in r'$,

(iii) $|\nu_k - \nu'_k| \le |\nu - \nu'|$,

(iv) $\mathsf{Cost}(\rho') \le \mathsf{Cost}(g(\rho')) + \Pi^{\mathrm{loc}} (|\nu - \nu'| - |\nu_k - \nu'_k|)$.

Figure 5 The definition of $t'$ when (a) $\nu'_k \le \nu_k$, (b) $\nu_k < \nu'_k < \nu_k + t$, (c) $\nu_k < \nu_k + t < \nu'_k$.


Notice that no property is required on the strategy $\sigma'_{\mathsf{Min}}$ for finite plays that do not start in $(\ell, \nu')$.

If $k = 1$, as there is no play of length 0, nothing has to be done to define $\sigma'_{\mathsf{Min}}$. Moreover, in that case,
$\rho' = (\ell, \nu')$ and $g(\rho') = (\ell, \nu)$. Both plays have size 1, $\nu$ and $\nu'$ are in the same region by hypothesis of the
lemma, and $\mathsf{Cost}(\rho') = \mathsf{Cost}(g(\rho')) = 0$, therefore all four properties are true.

Let us suppose now that the construction is done for a given $k \ge 1$, and perform it for $k + 1$. We start
with the construction of $\sigma'_{\mathsf{Min}}$. To that end, consider a play $\rho' = (\ell_1, \nu'_1) \to \cdots \to (\ell_k, \nu'_k)$ from $(\ell, \nu')$,
consistent with $\sigma'_{\mathsf{Min}}$, such that $\ell_k$ is a location of player Min. Let $t$ and $\delta$ be the choice of delay and transition
made by $\sigma_{\mathsf{Min}}$ on $g(\rho')$, i.e., $\sigma_{\mathsf{Min}}(g(\rho')) = (t, \delta)$. Then, we define $\sigma'_{\mathsf{Min}}(\rho') = (t', \delta)$ where $t' = \max(0, \nu_k + t - \nu'_k)$.
The delay $t'$ respects the guard of transition $\delta$ since either $\nu_k + t = \nu'_k + t'$, or $\nu_k \le \nu_k + t \le \nu'_k$, in which case
$\nu'_k$ is in the same region as $\nu_k + t$ since $\nu_k$ and $\nu'_k$ are in the same region. This is illustrated in Figure 5.
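The choice of the mimicking delay can be made concrete with a small sketch. The function names are hypothetical; the check enumerates a grid of valuations in $[0,1]$ and confirms that, after the delays, the distance between the two clock values never exceeds the distance before them, which is the fact used for property (iii).

```python
from fractions import Fraction


def mimic_delay(v: Fraction, t: Fraction, v_prime: Fraction) -> Fraction:
    """Delay t' = max(0, v + t - v') played from v' to imitate delay t from v."""
    return max(Fraction(0), v + t - v_prime)


def gap_never_grows(steps: int = 8) -> bool:
    """Check |(v + t) - (v' + t')| <= |v - v'| on a grid of valuations."""
    grid = [Fraction(i, steps) for i in range(steps + 1)]
    for v in grid:
        for v_prime in grid:
            for t in grid:
                if v + t > 1:          # the clock stays within [0, 1]
                    continue
                t_prime = mimic_delay(v, t, v_prime)
                if abs((v + t) - (v_prime + t_prime)) > abs(v - v_prime):
                    return False
    return True
```

When $\nu_k + t \ge \nu'_k$ both plays reach the same valuation; otherwise $t' = 0$ and the play from $\nu'_k$ simply does not wait.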

We now build the mapping $g$. Let $\rho' = (\ell_1, \nu'_1) \to \cdots \to (\ell_{k+1}, \nu'_{k+1})$ be a play from $(\ell, \nu')$ consistent with
$\sigma'_{\mathsf{Min}}$ and $\tilde{\rho}' = (\ell_1, \nu'_1) \to \cdots \to (\ell_k, \nu'_k)$ its prefix of size $k$. Let $(t', \delta)$ be the delay and transition taken after
$\tilde{\rho}'$. Using the construction of $g$ over plays of length $k$ by induction, the play $g(\tilde{\rho}') = (\ell_1, \nu_1) \to \cdots \to (\ell_k, \nu_k)$
(with $(\ell_1, \nu_1) = (\ell, \nu)$) verifies properties (i), (ii) and (iii). If $\ell_k$ is a location of Min and $\sigma_{\mathsf{Min}}(g(\tilde{\rho}')) = (t, \delta)$,
then $g(\rho') = g(\tilde{\rho}') \to (\ell_{k+1}, \nu_{k+1})$ is obtained by applying those choices on $g(\tilde{\rho}')$. If $\ell_k$ is a location of Max, the
last valuation $\nu_{k+1}$ of $g(\rho')$ is rather obtained by choosing action $(t, \delta)$ verifying $t = \max(0, \nu'_k + t' - \nu_k)$. Note
that transition $\delta$ is allowed since both $\nu_k + t$ and $\nu'_k + t'$ are in the same region (for similar reasons as above).

By induction hypothesis $|\tilde{\rho}'| = |g(\tilde{\rho}')|$, thus (i) holds, i.e., $|\rho'| = |g(\rho')|$. Moreover, $\nu_{k+1}$ and $\nu'_{k+1}$ are also
in the same region as either they are equal to $\nu_k + t$ and $\nu'_k + t'$, respectively, or $\delta$ contains a reset in which
case $\nu_{k+1} = \nu'_{k+1} = 0$, which proves (ii). To prove (iii), notice that we always have either $\nu_k + t = \nu'_k + t'$, or
$\nu_k \le \nu_k + t \le \nu'_k = \nu'_k + t'$, or $\nu'_k \le \nu'_k + t' \le \nu_k = \nu_k + t$. In all of these possibilities, we have
$|(\nu_k + t) - (\nu'_k + t')| \le |\nu_k - \nu'_k|$. By noticing again that either $\nu_{k+1} = \nu_k + t$ and $\nu'_{k+1} = \nu'_k + t'$, or $\delta$ contains a
reset in which case $\nu_{k+1} = \nu'_{k+1} = 0$, we conclude the proof of (iii). We finally check property (iv). In both cases:

$$\begin{aligned}
\mathsf{Cost}(\rho') &= \mathsf{Cost}(\tilde{\rho}') + \pi(\delta) + t' \pi(\ell_k) \\
&\le \mathsf{Cost}(g(\tilde{\rho}')) + \Pi^{\mathrm{loc}} (|\nu - \nu'| - |\nu_k - \nu'_k|) + \pi(\delta) + t' \pi(\ell_k) \\
&= \mathsf{Cost}(g(\rho')) + (t' - t) \pi(\ell_k) + \Pi^{\mathrm{loc}} (|\nu - \nu'| - |\nu_k - \nu'_k|).
\end{aligned}$$

If $\delta$ contains no reset, let us prove that

$$|t' - t| = |\nu_k - \nu'_k| - |\nu_{k+1} - \nu'_{k+1}|. \qquad (4)$$

Indeed, since $t' = \nu'_{k+1} - \nu'_k$ and $t = \nu_{k+1} - \nu_k$, we have $|t' - t| = |\nu'_{k+1} - \nu'_k - (\nu_{k+1} - \nu_k)|$. Then, two cases are
possible: either $t' = \max(0, \nu_k + t - \nu'_k)$ or $t = \max(0, \nu'_k + t' - \nu_k)$. So we have three different possibilities:

- if $t' + \nu'_k = t + \nu_k$, then $\nu'_{k+1} = \nu_{k+1}$, thus $|t' - t| = |\nu_k - \nu'_k| = |\nu_k - \nu'_k| - |\nu'_{k+1} - \nu_{k+1}|$.

- if $t = 0$, then $\nu_k = \nu_{k+1} \ge \nu'_{k+1} \ge \nu'_k$, thus $|\nu'_{k+1} - \nu'_k - (\nu_{k+1} - \nu_k)| = \nu'_{k+1} - \nu'_k = (\nu_k - \nu'_k) - (\nu_k - \nu'_{k+1}) =
|\nu_k - \nu'_k| - |\nu_{k+1} - \nu'_{k+1}|$.

- if $t' = 0$, then $\nu'_k = \nu'_{k+1} \ge \nu_{k+1} \ge \nu_k$, thus $|\nu'_{k+1} - \nu'_k - (\nu_{k+1} - \nu_k)| = \nu_{k+1} - \nu_k = (\nu'_k - \nu_k) - (\nu'_k - \nu_{k+1}) =
|\nu'_k - \nu_k| - |\nu'_{k+1} - \nu_{k+1}|$.

If $\delta$ contains a reset, then $\nu'_{k+1} = \nu_{k+1}$. If $t' = \nu_k + t - \nu'_k$, we have that $|t' - t| = |\nu_k - \nu'_k|$. Otherwise, either
$t = 0$ and $t' \le \nu_k - \nu'_k$, or $t' = 0$ and $t \le \nu'_k - \nu_k$.

In all cases, we obtain $|t' - t| \le |\nu_k - \nu'_k| - |\nu_{k+1} - \nu'_{k+1}|$ (with the equality (4) when $\delta$ contains no reset).
Coupled with the fact that $|\pi(\ell_k)| \le \Pi^{\mathrm{loc}}$, we conclude that:

$$\mathsf{Cost}(\rho') \le \mathsf{Cost}(g(\rho')) + \Pi^{\mathrm{loc}} (|\nu - \nu'| - |\nu_{k+1} - \nu'_{k+1}|).$$

Figure 6 A PTG with 2 clocks whose value function is not continuous inside a region

Figure 7 An SPTG where Min needs memory to play optimally

Now that $\sigma'_{\mathsf{Min}}$ and $g$ are defined (noticing that $g$ is stable by prefix, we extend naturally its definition to
infinite plays), notice that for all plays $\rho'$ from $(\ell, \nu')$ consistent with $\sigma'_{\mathsf{Min}}$, either $\rho'$ does not reach a final location
and its cost is $+\infty$, but in this case $g(\rho')$ has also cost $+\infty$; or $\rho'$ is finite. In this case let $\nu'_k$ be the clock
valuation of its last configuration, and $\nu_k$ be the clock valuation of the last configuration of $g(\rho')$. Combining
(iii) and (iv) we have $\mathsf{Cost}(\rho') \le \mathsf{Cost}(g(\rho')) + \Pi^{\mathrm{loc}} |\nu - \nu'|$, which concludes the proof.


B Non-continuity of the value function with more than one clock 


Let us consider the example in Figure 6 (that we describe informally since we did not properly define games
with multiple clocks), with clocks $x$ and $y$. One can easily check that, starting from a configuration $(\ell_0, 0, 0.5)$ in
location $\ell_0$ and where $x = 0$ and $y = 0.5$, the following cycle can be taken:
$(\ell_0, 0, 0.5) \xrightarrow{0,\ \delta_0,\ 0} (\ell_1, 0, 0.5) \xrightarrow{0.5,\ \delta_1,\ 2.5} (\ell_2, 0.5, 0) \xrightarrow{0.5,\ \delta_2,\ -2.5} (\ell_0, 0, 0.5)$,
where $\delta_0$, $\delta_1$ and $\delta_2$ denote respectively the transitions from $\ell_0$ to $\ell_1$; from
$\ell_1$ to $\ell_2$; and from $\ell_2$ to $\ell_0$. Observe that the cost of this cycle is null, and that no other delays can be
played, hence $\mathsf{Val}(\ell_0, 0, 0.5) = 0$. However, starting from a configuration $(\ell_0, 0, 0.6)$, and following the same
path, yields the cycle
$(\ell_0, 0, 0.6) \xrightarrow{0,\ \delta_0,\ 0} (\ell_1, 0, 0.6) \xrightarrow{0.4,\ \delta_1,\ 2} (\ell_2, 0.4, 0) \xrightarrow{0.6,\ \delta_2,\ -3} (\ell_0, 0, 0.6)$
with cost $-1$. Hence,
$\mathsf{Val}(\ell_0, 0, 0.6) = -\infty$, and the function is not continuous although both valuations $(0, 0.5)$ and $(0, 0.6)$ are in
the same region. Observe that this holds even for priced timed automata, since our example requires only one
player.
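The cost of this cycle can be recomputed as a function of the initial value of $y$. The sketch below is only a sanity check of the arithmetic: the price-rates 5 in $\ell_1$ and $-5$ in $\ell_2$ are not read from Figure 6 but inferred from the delays and costs quoted above (delay 0.5 for cost 2.5, delay 0.4 for cost 2, delay 0.6 for cost $-3$), and the function name is hypothetical.

```python
from fractions import Fraction

# Price-rates inferred from the quoted delays and costs (assumption).
RATE_L1 = 5
RATE_L2 = -5


def cycle_cost(y0: Fraction) -> Fraction:
    """Cost of one cycle l0 -> l1 -> l2 -> l0 started with x = 0 and y = y0."""
    delay_l1 = 1 - y0      # wait in l1 until y reaches 1
    delay_l2 = y0          # wait in l2 until x reaches 1
    return RATE_L1 * delay_l1 + RATE_L2 * delay_l2


assert cycle_cost(Fraction(1, 2)) == 0     # the cycle from y = 0.5 is free
assert cycle_cost(Fraction(3, 5)) == -1    # the cycle from y = 0.6 costs -1
```

Under these assumed rates the cycle costs $5 - 10\,y_0$, which is zero exactly at $y_0 = 0.5$ and negative for any larger $y_0$; iterating a negative cycle is what drives the value to $-\infty$.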


C Memory is required for Min to play optimally 

As an example, consider the SPTG of Figure 7, where $W$ is a positive integer, and every location has price-rate
0: hence, this game can be seen as an (untimed) min-cost reachability game as studied in [12].
We claim that the values of locations $\ell_1$ and $\ell_2$ are both $-W$. Indeed, consider the
following strategy for Min: during each of the first $W$ visits to $\ell_2$ (if any), go to $\ell_1$; else, go to $\ell_f$. Clearly, this
strategy ensures that the final location $\ell_f$ will eventually be reached, and that either (i) transition $(\ell_1, \ell_f)$ (with
weight $-W$) will eventually be traversed; or (ii) transition $(\ell_1, \ell_2)$ (with weight $-1$) will be traversed at least $W$
times. Hence, in all plays following this strategy, the cost will be at most $-W$. This strategy allows Min to
secure $-W$, but he cannot ensure a lower cost, since Max always has the opportunity to take the transition
$(\ell_1, \ell_f)$ (with weight $-W$) instead of cycling between $\ell_1$ and $\ell_2$. Hence, Max's optimal choice is to follow the
transition $(\ell_1, \ell_f)$ as soon as $\ell_1$ is reached, securing a cost of $-W$. The Min strategy we have just given is
optimal, and there is no optimal memoryless strategy for Min. Indeed, always playing $(\ell_2, \ell_f)$ does not ensure
a cost at most $-W$; and always playing $(\ell_2, \ell_1)$ does not guarantee to reach the target, and this strategy has
thus value $+\infty$.




















D Fake-optimality: proof of Lemma 3

First of all, notice that all finite plays $\rho \in \mathsf{Play}(\sigma_{\mathsf{Min}})$ with all clock valuations in the same interval $I$ of $\mathsf{int}(\sigma_{\mathsf{Min}})$
verify $\mathsf{Cost}(\rho) \le |I| \Pi^{\mathrm{loc}} + |L| \Pi^{\mathrm{tr}} - |\rho|/|L|$. Indeed, the cost of $\rho$ is the sum of the cost generated by staying in
locations, which is bounded by $|I| \Pi^{\mathrm{loc}}$, and the cost of the transitions. One can extract at least $|\rho|/|L|$ cycles
with transition prices at most $-1$ (by definition of an NC-strategy), and what remains is of size at most $|L|$,
ensuring that the transition cost is bounded by $|L| \Pi^{\mathrm{tr}} - |\rho|/|L|$.

Then, by splitting runs among intervals of $\mathsf{int}(\sigma_{\mathsf{Min}})$, we can easily obtain that all finite plays $\rho \in \mathsf{Play}(\sigma_{\mathsf{Min}})$
verify $\mathsf{Cost}(\rho) \le \Pi^{\mathrm{loc}} + (2|\sigma_{\mathsf{Min}}| - 1) \times |L| \Pi^{\mathrm{tr}} - (|\rho| - |\sigma_{\mathsf{Min}}|)/|L|$. Indeed, letting $I_1, \ldots, I_k$ be the intervals of
$\mathsf{int}(\sigma_{\mathsf{Min}})$ visited during $\rho$ (with $k \le |\sigma_{\mathsf{Min}}|$), one can split $\rho$ into $k$ runs $\rho = \rho_1 \xrightarrow{c_1} \rho_2 \cdots \rho_k$ such that in $\rho_i$
all clock values are in $I_i$ (remember that SPTGs contain no reset transitions). By the previous inequality, we
have $\mathsf{Cost}(\rho_i) \le |I_i| \Pi^{\mathrm{loc}} + |L| \Pi^{\mathrm{tr}} - |\rho_i|/|L|$. Thus, also splitting costs $c_i$ with respect to discrete cost and cost of
delaying, we obtain $\mathsf{Cost}(\rho) = \sum_{i=1}^{k} \mathsf{Cost}(\rho_i) + \sum_{i} c_i \le (2|\sigma_{\mathsf{Min}}| - 1) \times |L| \Pi^{\mathrm{tr}} + \Pi^{\mathrm{loc}} - (|\rho| - |\sigma_{\mathsf{Min}}|)/|L|$, since
$|\rho| \le \sum_i |\rho_i| + k \le \sum_i |\rho_i| + |\sigma_{\mathsf{Min}}|$ and $\sum_i |I_i| \le 1$.

We now turn to the proof of the lemma. To that end, we suppose known an attractor strategy for Min,
i.e., a strategy that ensures to reach a final location: it exists thanks to the hypothesis on the finiteness of the
values. From every configuration, it reaches a final location with a cost bounded above by a given constant $M$.
Notice first that, with the hypothesis that no configuration has a value $-\infty$ in the SPTG we consider, it is
not possible that $\mathsf{fake}^{\sigma_{\mathsf{Min}}}_{\mathcal{G}}(s) = -\infty$ for a configuration $s$ (i.e., that no runs of $\mathsf{Play}(s, \sigma_{\mathsf{Min}})$ reach the target).
Indeed, consider the strategy $\sigma'_{\mathsf{Min}}$ obtained by playing $\sigma_{\mathsf{Min}}$ until having computed a cost bounded above by a
fixed integer $N \in \mathbb{Z}$, in which case we switch to the attractor strategy. By the previous inequality, the switch
is sure to happen since the right term tends to $-\infty$ when the length of $\rho$ tends to $\infty$. Then, we know that
the value guaranteed by $\sigma'_{\mathsf{Min}}$ is at most $N$, implying that the optimal value $\mathsf{Val}(s)$ is $-\infty$, which contradicts
the hypothesis. Then, to prove the result of the lemma, consider the strategy $\sigma'_{\mathsf{Min}}$ obtained by playing $\sigma_{\mathsf{Min}}$
until having computed a cost bounded above by the finite value $\mathsf{fake}^{\sigma_{\mathsf{Min}}}_{\mathcal{G}}(s) - M$, in which case we switch to
the attractor strategy. Once again, the switch is sure to happen, implying that every play conforming to $\sigma'_{\mathsf{Min}}$
reaches the target: moreover, the cost of such a play is necessarily at most $\mathsf{fake}^{\sigma_{\mathsf{Min}}}_{\mathcal{G}}(s)$ by construction. Then,
we directly obtain that $\mathsf{Val}^{\sigma'_{\mathsf{Min}}}_{\mathcal{G}}(s) \le \mathsf{fake}^{\sigma_{\mathsf{Min}}}_{\mathcal{G}}(s)$.

E SPTGs with only urgent locations: extended version of Section 3

We rely on the proofs of [12] that can easily be adapted in our case, even though we must give the whole
explanation here, knowing that prices coming from goal functions can be rational, and hence do not strictly fall
in the framework of [12].

Since all locations in $\mathcal{G}$ are urgent, we may extract from a play $\rho = (\ell_0, \nu) \xrightarrow{c_0} (\ell_1, \nu) \xrightarrow{c_1} \cdots$ the clock
valuations, as well as prices $c_i = \pi(\ell_i, \ell_{i+1})$, hence denoting plays by their sequence of locations $\ell_0 \ell_1 \cdots$. The
cost of this play is $\mathsf{Cost}(\rho) = +\infty$ if $\ell_k \notin L_f$ for all $k \ge 0$; and $\mathsf{Cost}(\rho) = \sum_{i=0}^{k-1} \pi(\ell_i, \ell_{i+1}) + \varphi_{\ell_k}(\nu)$ if $k$ is the
least position such that $\ell_k \in L_f$.

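E.1 Computing the value for a particular valuation

Let us show how to compute the vector $\mathsf{Val}_\nu = (\mathsf{Val}(\ell, \nu))_{\ell \in L}$, for a given $\nu \in [0,r]$, in terms of a sequence
of values. Following the arguments of [12], we first observe that locations $\ell$ with values $\mathsf{Val}_\nu(\ell) = +\infty$ and
$\mathsf{Val}_\nu(\ell) = -\infty$ can be pre-computed (using respectively attractor and mean-payoff techniques) and removed
from the game without changing the values of the other nodes. Then, because of the particular structure of the
game $\mathcal{G}$ (where a real cost is paid only on the target location, all other prices being integers), for all plays $\rho$,
$\mathsf{Cost}(\rho)$ is a value from the set $\mathbb{Z}_{\nu,\varphi} = \mathbb{Z} + \{\varphi_\ell(\nu) \mid \ell \in L_f\}$. We further define $\mathbb{Z}^{+\infty}_{\nu,\varphi} = \mathbb{Z}_{\nu,\varphi} \cup \{+\infty\}$. Clearly,
$\mathbb{Z}_{\nu,\varphi}$ contains at most $|L_f|$ values between two consecutive integers, i.e.,

$$\forall i \in \mathbb{Z} \quad \big|[i, i+1] \cap \mathbb{Z}_{\nu,\varphi}\big| \le |L_f| \qquad (5)$$

Then, we define an operator $\mathcal{F} \colon (\mathbb{Z}^{+\infty}_{\nu,\varphi})^L \to (\mathbb{Z}^{+\infty}_{\nu,\varphi})^L$ mapping every vector $x = (x_\ell)_{\ell \in L}$ of $(\mathbb{Z}^{+\infty}_{\nu,\varphi})^L$ to
$\mathcal{F}(x) = (\mathcal{F}(x)_\ell)_{\ell \in L}$ defined by

$$\mathcal{F}(x)_\ell = \begin{cases}
\varphi_\ell(\nu) & \text{if } \ell \in L_f \\
\max_{(\ell,\ell') \in \Delta} \big(\pi(\ell, \ell') + x_{\ell'}\big) & \text{if } \ell \in L_{\mathsf{Max}} \\
\min_{(\ell,\ell') \in \Delta} \big(\pi(\ell, \ell') + x_{\ell'}\big) & \text{if } \ell \in L_{\mathsf{Min}}
\end{cases}$$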

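We will obtain $\mathsf{Val}_\nu$ as the limit of the sequence $(x^{(i)})_{i \ge 0}$ defined by $x^{(0)}_\ell = +\infty$ if $\ell \notin L_f$, and $x^{(0)}_\ell = \varphi_\ell(\nu)$ if
$\ell \in L_f$, and then $x^{(i)} = \mathcal{F}(x^{(i-1)})$ for $i \ge 1$.

The intuition behind is that $x^{(i)}$ is the value of the game (when the clock takes value $\nu$) if we impose that
Min must reach the target within $i$ steps (and get a payoff of $+\infty$ if it fails to do so). Formally, for a play
$\rho = \ell_0 \ell_1 \cdots$, we let $\mathsf{Cost}^{\le i}(\rho) = \mathsf{Cost}(\rho)$ if $\ell_k \in L_f$ for some $k \le i$, and $\mathsf{Cost}^{\le i}(\rho) = +\infty$ otherwise. We further
let $\mathsf{Val}^{\le i}_\nu(\ell) = \inf_{\sigma_{\mathsf{Min}}} \sup_{\sigma_{\mathsf{Max}}} \mathsf{Cost}^{\le i}(\mathsf{Play}((\ell, \nu), \sigma_{\mathsf{Max}}, \sigma_{\mathsf{Min}}))$ (where $\sigma_{\mathsf{Max}}$ and $\sigma_{\mathsf{Min}}$ are respectively strategies of
Max and Min). Lemma 1 of [12] allows us to easily obtain that

► Lemma 11. For all $i \ge 0$, and $\ell \in L$: $x^{(i)}_\ell = \mathsf{Val}^{\le i}_\nu(\ell)$.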
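The iteration of the operator can be sketched on a small untimed game. This is only an illustration: the dictionary-based encoding and the function name are hypothetical, and the example game is in the spirit of the one of Appendix C with $W = 3$, where the prices of the two transitions leaving $\ell_2$ are assumed to be 0 (they are not stated in the text).

```python
import math
from typing import Dict, List, Tuple


def value_iteration(owner: Dict[str, str],
                    finals: Dict[str, float],
                    edges: Dict[str, List[Tuple[str, int]]],
                    max_steps: int = 1000):
    """Iterate x -> F(x) from +infinity on non-final locations until stabilisation.

    owner:  non-final location -> "Min" or "Max"
    finals: final location -> final cost at the valuation under consideration
    edges:  non-final location -> list of (successor, transition price)
    """
    x = {loc: math.inf for loc in owner}
    x.update(finals)
    history = [dict(x)]
    for _ in range(max_steps):
        y = dict(finals)
        for loc, who in owner.items():
            candidates = [price + x[succ] for succ, price in edges[loc]]
            y[loc] = max(candidates) if who == "Max" else min(candidates)
        history.append(dict(y))
        if y == x:
            break
        x = y
    return x, history


# Assumed game: Max owns l1, Min owns l2, W = 3.
owner = {"l1": "Max", "l2": "Min"}
finals = {"lf": 0}
edges = {"l1": [("lf", -3), ("l2", -1)],
         "l2": [("l1", 0), ("lf", 0)]}   # prices out of l2 are assumptions
values, history = value_iteration(owner, finals, edges)
assert values["l1"] == -3 and values["l2"] == -3
```

On this game the values decrease by one every two iterations before stabilising at $-W$, which illustrates why the number of iterations depends on the magnitude of the prices and not only on the number of locations.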

Now, let us study how the sequence $(\mathsf{Val}^{\le i}_\nu)_{i \ge 0}$ behaves and converges to the finite values of the game. Using
again the same arguments as in [12] ($\mathcal{F}$ is a monotonic and Scott-continuous operator over the complete lattice
$(\mathbb{Z}^{+\infty}_{\nu,\varphi})^L$, etc.), the sequence $(\mathsf{Val}^{\le i}_\nu)_{i \ge 0}$ converges towards the greatest fixed point of $\mathcal{F}$. Let us now show that
$\mathsf{Val}_\nu$ is actually this greatest fixed point. First, Corollary 1 of [12] can be adapted to obtain

► Lemma 12. For all $\ell \in L$: $\mathsf{Val}^{\le |L|}_\nu(\ell) \le |L| \Pi^{\mathrm{tr}} + \Pi^{\mathrm{fin}}$.

The next step is to show that the values that can be computed along the sequence (still assuming that $\mathsf{Val}_\nu(\ell)$ is finite for all $\ell$) are taken from a finite set:

► Lemma 13. For all $i \geq 0$ and for all $\ell \in L$:

$$\mathsf{Val}^{\leq |L|+i}_\nu(\ell) \in \mathsf{PossVal}_\nu = \bigl[-(|L|-1)\Pi^{tr} - \Pi^{fin},\; |L|\Pi^{tr} + \Pi^{fin}\bigr] \cap \mathbb{Z}_{\nu,\varphi}$$

where $\mathsf{PossVal}_\nu$ has cardinality bounded by $|L_f| \times \bigl((2|L|-1)\Pi^{tr} + 2\Pi^{fin} + 1\bigr)$.


Proof. Following the proof of [12, Lemma 3], it is easy to show that if Min can secure, from some vertex $\ell$, a cost less than $-(|L|-1)\Pi^{tr} - \Pi^{fin}$, i.e., $\mathsf{Val}(\ell,\nu) < -(|L|-1)\Pi^{tr} - \Pi^{fin}$, then it can secure an arbitrarily small cost from that configuration, i.e., $\mathsf{Val}(\ell,\nu) = -\infty$, which contradicts our hypothesis that the value is finite.

Hence, for all $i \geq 0$, for all $\ell$: $\mathsf{Val}^{\leq |L|+i}_\nu(\ell) \geq \mathsf{Val}_\nu(\ell) \geq -(|L|-1)\Pi^{tr} - \Pi^{fin}$. By Lemma 12 and since the sequence is non-increasing, we conclude that, for all $i \geq 0$ and for all $\ell \in L$:

$$-(|L|-1)\Pi^{tr} - \Pi^{fin} \leq \mathsf{Val}^{\leq |L|+i}_\nu(\ell) \leq |L|\Pi^{tr} + \Pi^{fin}.$$

Since all $\mathsf{Val}^{\leq |L|+i}_\nu(\ell)$ are also in $\mathbb{Z}_{\nu,\varphi}$, we conclude that $\mathsf{Val}^{\leq |L|+i}_\nu(\ell) \in \mathsf{PossVal}_\nu$ for all $i \geq 0$. The upper bound on the size of $\mathsf{PossVal}_\nu$ is established by (5). ◄
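
To make the cardinality bound of Lemma 13 concrete, the following Python sketch enumerates the candidate set for given parameters and compares its size with the bound. The function name `poss_val` and the numeric parameters are assumptions chosen for illustration only.

```python
from fractions import Fraction
import math

def poss_val(n_locs, pi_tr, pi_fin, final_values):
    """Enumerate [-(|L|-1)*pi_tr - pi_fin, |L|*pi_tr + pi_fin] intersected
    with Z + {phi_l(nu)}, where final_values lists the phi_l(nu)."""
    lo = -(n_locs - 1) * pi_tr - pi_fin
    hi = n_locs * pi_tr + pi_fin
    values = set()
    for phi in final_values:
        frac = phi - math.floor(phi)  # fractional part, in [0, 1)
        k = lo                        # lo is an integer, so k + frac >= lo
        while k + frac <= hi:
            values.add(k + frac)
            k += 1
    return values

# Hypothetical parameters: |L| = 3, Pi^tr = 2, Pi^fin = 1, two final locations.
finals = [Fraction(1, 2), Fraction(1, 3)]
candidates = poss_val(3, 2, 1, finals)
bound = len(finals) * ((2 * 3 - 1) * 2 + 2 * 1 + 1)
```

Each final location contributes at most one value per unit interval, which is why the size of `candidates` stays below `bound`.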


This allows us to bound the number of iterations needed for the sequence to stabilise. In the worst case, the sequence starts from the highest possible vector, where all locations are assigned a value bounded above by $|L|\Pi^{tr} + \Pi^{fin}$ (which is itself reached after $|L|$ steps), and decreases down to a vector where all locations are assigned a value bounded below by $-(|L|-1)\Pi^{tr} - \Pi^{fin}$. Hence:

► Corollary 14. The sequence $(\mathsf{Val}^{\leq i}_\nu)_{i\geq 0}$ stabilises after a number of steps at most $|L_f| \times |L| \times \bigl((2|L|-1)\Pi^{tr} + 2\Pi^{fin} + 1\bigr) + |L|$.

Finally, the proofs of [12, Lemma 4 and Corollary 2] allow us to conclude that this sequence converges towards the value $\mathsf{Val}_\nu$ of the game (when all values are finite), which proves that the value iteration scheme of Algorithm 1 computes exactly $\mathsf{Val}_\nu$ for all $\nu \in [0,r]$. Indeed, this algorithm also works when some values are not finite. As a corollary, we obtain a characterisation of the possible values of $\mathcal{G}$:

► Corollary 15. For all $r$-SPTG $\mathcal{G}$ with only urgent locations, for all locations $\ell \in L$ and valuations $\nu \in [0,r]$, $\mathsf{Val}_\mathcal{G}(\ell,\nu)$ is contained in the set $\mathsf{PossVal}_\nu \cup \{-\infty, +\infty\}$ of cardinal $O(\mathrm{poly}(|L|, \Pi^{tr}, \Pi^{fin}))$, pseudo-polynomial with respect to the size of $\mathcal{G}$.

Figure 8 Network of affine functions defined by $F_\mathcal{G}$: functions in bold are final affine functions of $\mathcal{G}$, whereas non-bold ones are their translations with weights $k \in [-(|L|-1)\Pi^{tr}, |L|\Pi^{tr}] \cap \mathbb{Z}$. $\mathsf{PossCP}_\mathcal{G}$ is the set of abscissae of intersection points, represented by black disks.


Finally, Sections 3.4 and 3.5 of [12] explain how to compute simultaneously optimal strategies for both players. In our context, this allows us to obtain, for every valuation $\nu \in [0,r]$ and location $\ell$ of an $r$-SPTG such that $\mathsf{Val}(\ell,\nu) \notin \{-\infty, +\infty\}$, a memoryless optimal strategy for Max, and an optimal switching strategy for Min: a switching strategy is described by a pair $(\sigma^1_{Min}, \sigma^2_{Min})$ of memoryless strategies and a switch threshold $K$, so that the optimal strategy is obtained by playing $\sigma^1_{Min}$ until the value of the current finite play is below $K$, in which case we switch to strategy $\sigma^2_{Min}$, which can be taken as an attractor strategy that only aims at reaching a final location.
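
As an illustration of the shape of a switching strategy, here is a minimal Python sketch. The class name `SwitchingStrategy`, the encoding of memoryless strategies as dictionaries, and the choice to make the switch permanent once the threshold is crossed are assumptions of this sketch, not details taken from [12].

```python
class SwitchingStrategy:
    """Play sigma1 until the accumulated cost drops below the threshold K,
    then play sigma2 (assumed here to be an attractor strategy)."""

    def __init__(self, sigma1, sigma2, threshold):
        self.sigma1 = sigma1        # memoryless: location -> successor
        self.sigma2 = sigma2        # memoryless: location -> successor
        self.threshold = threshold  # the switch threshold K
        self.switched = False

    def next_move(self, location, cost_so_far):
        if cost_so_far < self.threshold:
            self.switched = True    # assumption: the switch is permanent
        chosen = self.sigma2 if self.switched else self.sigma1
        return chosen[location]

# Hypothetical example: sigma1 keeps cycling through 'b', sigma2 heads to 'f'.
strategy = SwitchingStrategy({"a": "b"}, {"a": "f"}, threshold=-5)
first = strategy.next_move("a", 0)
second = strategy.next_move("a", -6)
third = strategy.next_move("a", 0)
```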

E.2 Study of the complete value functions: $\mathcal{G}$ is finitely optimal

Still for an $r$-SPTG with only urgent locations, we now study a precise characterisation of the functions $\mathsf{Val}(\ell)\colon \nu \in [0,r] \mapsto \mathsf{Val}(\ell,\nu)$, for all $\ell$, in particular showing that these are cost functions of $\mathsf{CF}_{\{[0,r]\}}$. We first define the set $F_\mathcal{G}$ of affine functions over $[0,r]$ as follows:

$$F_\mathcal{G} = \{k + \varphi_\ell \mid \ell \in L_f \wedge k \in [-(|L|-1)\Pi^{tr}, |L|\Pi^{tr}] \cap \mathbb{Z}\}$$

Observe that this set is finite and that its cardinality is $2|L|^2\Pi^{tr}$, pseudo-polynomial in the size of $\mathcal{G}$. Moreover, as a direct consequence of Corollary 15, this set contains enough information to compute the value of the game in each possible valuation of the clock, in the following sense:

► Lemma 16. For all $\ell \in L$, for all $\nu \in [0,r]$: if $\mathsf{Val}(\ell,\nu)$ is finite, then there is $f \in F_\mathcal{G}$ such that $\mathsf{Val}(\ell,\nu) = f(\nu)$.

We compute the set of intersections of two affine functions of $F_\mathcal{G}$:

$$\mathsf{PossCP}_\mathcal{G} = \{\nu \in [0,r] \mid \exists f_1, f_2 \in F_\mathcal{G} \quad f_1 \neq f_2 \wedge f_1(\nu) = f_2(\nu)\}.$$

This set is depicted in Figure 8 on an example. Observe that $\mathsf{PossCP}_\mathcal{G}$ contains at most $|F_\mathcal{G}|^2$ points since all functions from $F_\mathcal{G}$ are affine, hence they can intersect at most once with every other function. Thus, the cardinality of $\mathsf{PossCP}_\mathcal{G}$ is $4|L_f|^4(\Pi^{tr})^2$, also bounded by a pseudo-polynomial in the size of $\mathcal{G}$. Moreover, since all functions of $F_\mathcal{G}$ take rational values in $0$ and $r \in \mathbb{Q}$, we know that $\mathsf{PossCP}_\mathcal{G} \subseteq \mathbb{Q}$. This set contains all the cutpoints of the value function of $\mathcal{G}$, as shown in Proposition 4.
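
The set of candidate cutpoints can be computed by intersecting the affine functions pairwise with exact rational arithmetic. The sketch below assumes each function is encoded as a hypothetical `(slope, intercept)` pair; the two final functions and the range of integer shifts are made up for the example.

```python
from fractions import Fraction
from itertools import combinations

def possible_cutpoints(functions, r):
    """Abscissae in [0, r] where two distinct affine functions intersect.

    functions: iterable of (slope, intercept) pairs with rational entries
    """
    points = set()
    for (s1, c1), (s2, c2) in combinations(functions, 2):
        if s1 == s2:
            continue  # parallel or identical functions give no single point
        x = Fraction(c2 - c1) / Fraction(s1 - s2)  # s1*x + c1 == s2*x + c2
        if 0 <= x <= r:
            points.add(x)
    return sorted(points)

# Hypothetical final functions nu and 1 - nu, translated by k in {-1, 0, 1}.
finals = [(1, 0), (-1, 1)]
functions = [(s, c + k) for (s, c) in finals for k in range(-1, 2)]
cutpoints = possible_cutpoints(functions, r=1)
```

Since every pair of functions is inspected once, the running time is quadratic in the number of functions, matching the $|F_\mathcal{G}|^2$ bound on the number of points.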

Notice that this result allows us to compute $\mathsf{Val}(\ell)$ for every $\ell \in L$. First, we compute the set $\mathsf{PossCP}_\mathcal{G} = \{y_1, y_2, \ldots, y_m\}$, which can be done in pseudo-polynomial time in the size of $\mathcal{G}$. Then, for all $1 \leq i \leq m$, we can compute the vectors $(\mathsf{Val}(\ell, y_i))_{\ell\in L}$ of values in each location when the clock takes value $y_i$ using Algorithm 1. This provides the value of $\mathsf{Val}(\ell)$ in each cutpoint, for all locations $\ell$, which is sufficient to characterise the whole value function, as it is continuous and piecewise affine. Observe that all cutpoints, and values in the cutpoints, in the value function are rational numbers, so Algorithm 1 is effective. Thanks to the above discussions, this procedure consists in a pseudo-polynomial number of calls to a pseudo-polynomial algorithm, hence, it runs in pseudo-polynomial time. This allows us to conclude that $\mathsf{Val}_\mathcal{G}(\ell)$ is a cost function for all $\ell$. This proves item (iii) of the definition of finite optimality for SPTGs with only urgent locations.
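
Once the values at the candidate cutpoints are known, the whole function can be recovered by linear interpolation, since it is continuous and piecewise affine. The following Python sketch shows this step; the function name `evaluate` and the sample data are assumptions, and the sketch further assumes that the sorted list of cutpoints includes both endpoints of the interval.

```python
from bisect import bisect_right
from fractions import Fraction

def evaluate(cutpoints, values, nu):
    """Interpolate a continuous piecewise-affine function.

    cutpoints: sorted abscissae (at least two, endpoints included)
    values:    values[i] is the function's value at cutpoints[i]
    """
    if not cutpoints[0] <= nu <= cutpoints[-1]:
        raise ValueError("valuation outside the domain")
    # index of the right end of the piece containing nu
    i = min(bisect_right(cutpoints, nu), len(cutpoints) - 1)
    x0, x1 = cutpoints[i - 1], cutpoints[i]
    y0, y1 = values[i - 1], values[i]
    return y0 + (y1 - y0) * (nu - x0) / (x1 - x0)

# Hypothetical value function with a single inner cutpoint at 1/2.
cps = [Fraction(0), Fraction(1, 2), Fraction(1)]
vals = [Fraction(2), Fraction(1), Fraction(3)]
at_quarter = evaluate(cps, vals, Fraction(1, 4))
at_three_quarters = evaluate(cps, vals, Fraction(3, 4))
```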

Let us conclude the proof that SPTGs are finitely optimal by showing that Min has a fake-optimal NC-strategy, and Max has an optimal FP-strategy. Let $\nu_1, \nu_2, \ldots, \nu_k$ be the sequence of elements from $\mathsf{PossCP}_\mathcal{G}$ in increasing order, and let us assume $\nu_0 = 0$. For all $1 \leq i \leq k$ let $f^\ell_i$ be the function from $F_\mathcal{G}$ that defines the piece of $\mathsf{Val}_\mathcal{G}(\ell)$ in the interval $[\nu_{i-1}, \nu_i]$ (we have shown above that such an $f^\ell_i$ always exists). Formally, for all $1 \leq i \leq k$, $f^\ell_i \in F_\mathcal{G}$ verifies $\mathsf{Val}(\ell,\nu) = f^\ell_i(\nu)$, for all $\nu \in [\nu_{i-1}, \nu_i]$. Next, for all $1 \leq i \leq k$, let $\mu_i$ be a value taken in the middle of $[\nu_{i-1}, \nu_i]$, i.e., $\mu_i = \frac{\nu_{i-1} + \nu_i}{2}$. Note that all $\mu_i$'s are rational values since all $\nu_i$'s are. By applying solveInstant in each $\mu_i$, we can compute $(\mathsf{Val}_\mathcal{G}(\ell, \mu_i))_{\ell\in L}$, and we can extract an optimal memoryless strategy $\sigma^i_{Max}$ for Max and an optimal switching strategy $\sigma^i_{Min}$ for Min. Thus we know that, for all $\ell \in L$, playing $\sigma^i_{Min}$ (respectively, $\sigma^i_{Max}$) from $(\ell, \mu_i)$ allows Min (respectively, Max) to ensure a cost at most (respectively, at least) $\mathsf{Val}_\mathcal{G}(\ell, \mu_i) = f^\ell_i(\mu_i)$. However, it is easy to check that the bound given by $f^\ell_i(\mu_i)$ holds in every valuation, i.e., for all $\ell$, for all $\nu$:

$$\mathsf{Val}^{\sigma^i_{Min}}_\mathcal{G}(\ell,\nu) \leq f^\ell_i(\nu) \quad\text{and}\quad \mathsf{Val}^{\sigma^i_{Max}}_\mathcal{G}(\ell,\nu) \geq f^\ell_i(\nu).$$

This holds because: (i) Min can play $\sigma^i_{Min}$ from all clock valuations (in $[0,r]$) since we are considering an $r$-SPTG; and (ii) Max does not have more possible strategies from an arbitrary valuation $\nu \in [0,r]$ than from $\mu_i$, because all locations are urgent and time cannot elapse (neither from $\nu$, nor from $\mu_i$). And symmetrically for Max.

We conclude that Min can consistently play the same strategy $\sigma^i_{Min}$ from all configurations $(\ell,\nu)$ with $\nu \in [\nu_{i-1}, \nu_i]$ and secure a cost which is at most $f^\ell_i(\nu) = \mathsf{Val}_\mathcal{G}(\ell,\nu)$, i.e., $\sigma^i_{Min}$ is optimal on this interval. By definition of $\sigma^i_{Min}$, it is easy to extract from it a fake-optimal NC-strategy (actually, $\sigma^i_{Min}$ is a switching strategy described by a pair $(\sigma^1_{Min}, \sigma^2_{Min})$, and $\sigma^1_{Min}$ can be used to obtain the fake-optimal NC-strategy). The same reasoning applies to strategies of Max and we conclude that Max has an optimal FP-strategy.

F Every SPTG is finitely optimal 

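We start with an auxiliary lemma showing a property of the rates of change of the value functions associated to non-urgent locations.

► Lemma 17. Let $\mathcal{G}$ be an $r$-SPTG, $\ell$ and $\ell'$ be non-urgent locations of Min and Max, respectively. Then for all $0 \leq \nu < \nu' \leq r$:

$$\frac{\mathsf{Val}_\mathcal{G}(\ell,\nu') - \mathsf{Val}_\mathcal{G}(\ell,\nu)}{\nu' - \nu} \geq -\pi(\ell) \quad\text{and}\quad \frac{\mathsf{Val}_\mathcal{G}(\ell',\nu') - \mathsf{Val}_\mathcal{G}(\ell',\nu)}{\nu' - \nu} \leq -\pi(\ell').$$

Proof. For the location $\ell$, the inequality rewrites in

$$\mathsf{Val}_\mathcal{G}(\ell,\nu) \leq (\nu' - \nu)\pi(\ell) + \mathsf{Val}_\mathcal{G}(\ell,\nu').$$

Using the upper definition of the value (thanks to the determinacy result), it suffices to prove, for all $\varepsilon > 0$, the existence of a strategy $\sigma_{Min}$ such that for all strategies $\sigma_{Max}$ of the opponent

$$\mathsf{Cost}(\mathsf{Play}((\ell,\nu), \sigma_{Min}, \sigma_{Max})) \leq (\nu' - \nu)\pi(\ell) + \mathsf{Val}_\mathcal{G}(\ell,\nu') + \varepsilon.$$

The definition of the value implies the existence of a strategy $\sigma'_{Min}$ such that for all strategies $\sigma_{Max}$

$$\mathsf{Cost}(\mathsf{Play}((\ell,\nu'), \sigma'_{Min}, \sigma_{Max})) \leq \mathsf{Val}_\mathcal{G}(\ell,\nu') + \varepsilon.$$

Then, $\sigma_{Min}$ can be obtained by playing from $(\ell,\nu)$, at the first turn, as prescribed by $\sigma'_{Min}$ but delaying $\nu' - \nu$ time units more (which we are allowed to do since $\ell$ is non-urgent), and, for other turns, directly like $\sigma'_{Min}$. A similar reasoning allows us to obtain the result for $\ell'$. ◄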

Then, we observe that the construction of $\mathcal{G}_r$ does not alter the value of the game:

► Lemma 18. For all $\nu \in [0,r]$ and locations $\ell$, $\mathsf{Val}_\mathcal{G}(\ell,\nu) = \mathsf{Val}_{\mathcal{G}_r}(\ell,\nu)$.

Now, we turn our attention to the construction of $\mathcal{G}_{L',r}$. We show that, even if the locations in $L'$ are turned into urgent locations, we may still obtain for them a result on the rates of change similar to the one of Lemma 17.

► Lemma 19. For all locations $\ell \in L' \cap L_{Min}$ (respectively, $\ell \in L' \cap L_{Max}$), and $\nu \in [0,r]$, $\mathsf{Val}_{\mathcal{G}_{L',r}}(\ell,\nu) \leq (r - \nu)\pi(\ell) + \mathsf{Val}_\mathcal{G}(\ell,r)$ (respectively, $\mathsf{Val}_{\mathcal{G}_{L',r}}(\ell,\nu) \geq (r - \nu)\pi(\ell) + \mathsf{Val}_\mathcal{G}(\ell,r)$).

Proof. It suffices to notice that from $(\ell,\nu)$, Min (respectively, Max) may choose to go directly in $\ell^f$, ensuring the value $(r - \nu)\pi(\ell) + \mathsf{Val}_\mathcal{G}(\ell,r)$. ◄

F.1 Proof of Proposition 5

Let $\sigma_{Min}$ and $\sigma_{Max}$ be a fake-optimal NC-strategy of Min and an optimal FP-strategy of Max in $\mathcal{G}_{L'\cup\{\ell\},r}$, respectively. Notice that both strategies are also well-defined finite positional strategies in $\mathcal{G}_{L',r}$.

First, let us show that $\sigma_{Min}$ is indeed an NC-strategy in $\mathcal{G}_{L',r}$. Take a finite play $(\ell_0,\nu_0) \xrightarrow{c_0} \cdots \xrightarrow{c_{k-1}} (\ell_k,\nu_k)$, of length $k \geq 2$, that conforms with $\sigma_{Min}$ in $\mathcal{G}_{L',r}$, and with $\ell_0 = \ell_k$ and $\nu_0, \nu_k$ in the same interval $I$ of $\mathsf{int}(\sigma_{Min})$. For every $\ell_i$ that is in $L_{Min}$, and $\nu \in I$, $\sigma_{Min}(\ell_i,\nu)$ must have a 0 delay, otherwise $\nu_k$ would not be in the same interval as $\nu_0$. Thus, the play $(\ell_0,\nu_0) \to \cdots \to (\ell_k,\nu_0)$ also conforms with $\sigma_{Min}$ (with possibly different costs). Furthermore, as all the delays are 0 we are sure that this play is also a valid play in $\mathcal{G}_{L'\cup\{\ell\},r}$, in which $\sigma_{Min}$ is an NC-strategy. Therefore, $\pi(\ell_0,\ell_1) + \cdots + \pi(\ell_{k-1},\ell_k) \leq -1$, and $\sigma_{Min}$ is an NC-strategy in $\mathcal{G}_{L',r}$.

We now show the result for $\ell \in L_{Min}$. The proof for $\ell \in L_{Max}$ is a straightforward adaptation. Notice that every play in $\mathcal{G}_{L',r}$ that conforms with $\sigma_{Min}$ is also a play in $\mathcal{G}_{L'\cup\{\ell\},r}$ that conforms with $\sigma_{Min}$, as $\sigma_{Min}$ is defined in $\mathcal{G}_{L'\cup\{\ell\},r}$ and thus plays with no delay in location $\ell$. Thus, for all $\nu \in [a,r]$ and $\ell' \in L$, by the optimality result of Lemma 3,

$$\mathsf{Val}_{\mathcal{G}_{L',r}}(\ell',\nu) \leq \mathsf{fake}^{\sigma_{Min}}_{\mathcal{G}_{L',r}}(\ell',\nu) = \mathsf{fake}^{\sigma_{Min}}_{\mathcal{G}_{L'\cup\{\ell\},r}}(\ell',\nu) = \mathsf{Val}_{\mathcal{G}_{L'\cup\{\ell\},r}}(\ell',\nu). \qquad (6)$$


To obtain that $\mathsf{Val}_{\mathcal{G}_{L',r}}(\ell',\nu) = \mathsf{Val}_{\mathcal{G}_{L'\cup\{\ell\},r}}(\ell',\nu)$, it remains to show the reverse inequality. To that extent, let $\rho$ be a finite play in $\mathcal{G}_{L',r}$ that conforms with $\sigma_{Max}$, starts in a configuration $(\ell',\nu)$ with $\nu \in [a,r]$, and ends in a final location. We show by induction on the length of $\rho$ that $\mathsf{Cost}(\rho) \geq \mathsf{Val}_{\mathcal{G}_{L'\cup\{\ell\},r}}(\ell',\nu)$. If $\rho$ has size 1 then $\ell'$ is a final location and $\mathsf{Cost}(\rho) = \mathsf{Val}_{\mathcal{G}_{L'\cup\{\ell\},r}}(\ell',\nu) = \varphi_{\ell'}(\nu)$.

Otherwise $\rho = (\ell',\nu) \xrightarrow{c} \rho'$ where $\rho'$ is a run that conforms with $\sigma_{Max}$, starting in a configuration $(\ell'',\nu'')$ and ending in a final configuration. By induction hypothesis, we have $\mathsf{Cost}(\rho') \geq \mathsf{Val}_{\mathcal{G}_{L'\cup\{\ell\},r}}(\ell'',\nu'')$. We now distinguish three cases, the two first being immediate:


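- If $\ell' \in L_{Max}$, then $\sigma_{Max}(\ell',\nu)$ leads to the next configuration $(\ell'',\nu'')$, thus

$$\mathsf{Val}_{\mathcal{G}_{L'\cup\{\ell\},r}}(\ell',\nu) = \mathsf{Val}^{\sigma_{Max}}_{\mathcal{G}_{L'\cup\{\ell\},r}}(\ell',\nu) = c + \mathsf{Val}^{\sigma_{Max}}_{\mathcal{G}_{L'\cup\{\ell\},r}}(\ell'',\nu'') \leq c + \mathsf{Cost}(\rho') = \mathsf{Cost}(\rho).$$

- If $\ell' \in L_{Min}$, and $\ell' \neq \ell$ or $\nu'' = \nu$, we have that $(\ell',\nu) \xrightarrow{c} (\ell'',\nu'')$ is a valid transition in $\mathcal{G}_{L'\cup\{\ell\},r}$. Therefore, $\mathsf{Val}_{\mathcal{G}_{L'\cup\{\ell\},r}}(\ell',\nu) \leq c + \mathsf{Val}_{\mathcal{G}_{L'\cup\{\ell\},r}}(\ell'',\nu'')$, hence

$$\mathsf{Cost}(\rho) = c + \mathsf{Cost}(\rho') \geq c + \mathsf{Val}_{\mathcal{G}_{L'\cup\{\ell\},r}}(\ell'',\nu'') \geq \mathsf{Val}_{\mathcal{G}_{L'\cup\{\ell\},r}}(\ell',\nu).$$

- Finally, if $\ell' = \ell$ and $\nu'' > \nu$, then $c = (\nu'' - \nu)\pi(\ell) + \pi(\ell,\ell'')$. As $(\ell,\nu'') \xrightarrow{\pi(\ell,\ell'')} (\ell'',\nu'')$ is a valid transition in $\mathcal{G}_{L'\cup\{\ell\},r}$, we have $\mathsf{Val}_{\mathcal{G}_{L'\cup\{\ell\},r}}(\ell,\nu'') \leq \pi(\ell,\ell'') + \mathsf{Val}_{\mathcal{G}_{L'\cup\{\ell\},r}}(\ell'',\nu'')$. Furthermore, since $\nu'' \in [a,r]$, the slope hypothesis on $[a,r]$ yields

$$\mathsf{Val}_{\mathcal{G}_{L'\cup\{\ell\},r}}(\ell,\nu) \leq \mathsf{Val}_{\mathcal{G}_{L'\cup\{\ell\},r}}(\ell,\nu'') + (\nu'' - \nu)\pi(\ell) \leq \mathsf{Val}_{\mathcal{G}_{L'\cup\{\ell\},r}}(\ell'',\nu'') + \pi(\ell,\ell'') + (\nu'' - \nu)\pi(\ell).$$

Therefore

$$\mathsf{Cost}(\rho) = (\nu'' - \nu)\pi(\ell) + \pi(\ell,\ell'') + \mathsf{Cost}(\rho') \geq (\nu'' - \nu)\pi(\ell) + \pi(\ell,\ell'') + \mathsf{Val}_{\mathcal{G}_{L'\cup\{\ell\},r}}(\ell'',\nu'') \geq \mathsf{Val}_{\mathcal{G}_{L'\cup\{\ell\},r}}(\ell,\nu).$$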


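This concludes the induction. As a consequence,

$$\inf_{\sigma_{Min} \in \mathsf{Strat}_{Min}(\mathcal{G}_{L',r})} \mathsf{Cost}_{\mathcal{G}_{L',r}}(\mathsf{Play}((\ell',\nu), \sigma_{Min}, \sigma_{Max})) \geq \mathsf{Val}_{\mathcal{G}_{L'\cup\{\ell\},r}}(\ell',\nu)$$

for all locations $\ell'$ and $\nu \in [a,r]$, which finally proves that $\mathsf{Val}_{\mathcal{G}_{L',r}}(\ell',\nu) \geq \mathsf{Val}_{\mathcal{G}_{L'\cup\{\ell\},r}}(\ell',\nu)$. Fake-optimality of $\sigma_{Min}$ over $[a,r]$ in $\mathcal{G}_{L'\cup\{\ell\},r}$ is then obtained by (6).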


F.2 Proof that left(r) < r

This lemma allows us to effectively compute $\mathsf{left}(r)$:

► Lemma 20. Let $\mathcal{G}$ be an SPTG, $L' \subseteq L \setminus L_u$, and $r \in (0,1]$, such that $\mathcal{G}_{L'',r}$ is finitely optimal for all $L'' \subseteq L'$. Then, $\mathsf{left}_{L'}(r)$ is the minimal valuation such that for all locations $\ell \in L' \cap L_{Min}$ (respectively, $\ell \in L' \cap L_{Max}$), the slopes of the affine sections of the cost function $\mathsf{Val}_{\mathcal{G}_{L',r}}(\ell)$ on $[\mathsf{left}(r), r]$ are at least (respectively, at most) $-\pi(\ell)$. Moreover, $\mathsf{left}(r) < r$.

Proof. Since $\mathsf{Val}_{\mathcal{G}_{L',r}}(\ell) = \mathsf{Val}_\mathcal{G}(\ell)$ on $[\mathsf{left}(r), r]$, and as $\ell$ is non-urgent in $\mathcal{G}$, Lemma 17 states that all the slopes of $\mathsf{Val}_\mathcal{G}(\ell)$ are at least (respectively, at most) $-\pi(\ell)$ on $[\mathsf{left}(r), r]$.

We now show the minimality property by contradiction. Therefore, let $r' < \mathsf{left}(r)$ such that all cost functions $\mathsf{Val}_{\mathcal{G}_{L',r}}(\ell)$ are affine on $[r', \mathsf{left}(r)]$, and assume that for all $\ell \in L' \cap L_{Min}$ (respectively, $\ell \in L' \cap L_{Max}$), the slopes of $\mathsf{Val}_{\mathcal{G}_{L',r}}(\ell)$ on $[r', \mathsf{left}(r)]$ are at least (respectively, at most) $-\pi(\ell)$. Hence, this property holds on $[r', r]$. Then, by applying Proposition 5 $|L'|$ times (here, we use the finite optimality of the games $\mathcal{G}_{L'',r}$ with $L'' \subseteq L'$), one can show that for all $\nu \in [r', r]$, $\mathsf{Val}_{\mathcal{G}_r}(\ell,\nu) = \mathsf{Val}_{\mathcal{G}_{L',r}}(\ell,\nu)$. Using Lemma 18 we also know that for all $\nu \leq r$, and $\ell$, $\mathsf{Val}_{\mathcal{G}_r}(\ell,\nu) = \mathsf{Val}_\mathcal{G}(\ell,\nu)$. Thus, $\mathsf{Val}_{\mathcal{G}_{L',r}}(\ell,\nu) = \mathsf{Val}_\mathcal{G}(\ell,\nu)$. As $r' < \mathsf{left}(r)$, this contradicts the definition of $\mathsf{left}_{L'}(r)$.

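We finally prove that $\mathsf{left}(r) < r$. This is immediate in case $\mathsf{left}(r) = 0$, since $r > 0$. Otherwise, from the result obtained previously, we know that there exist $r' < \mathsf{left}(r)$ and $\ell^\star \in L'$ such that $\mathsf{Val}_{\mathcal{G}_{L',r}}(\ell^\star)$ is affine on $[r', \mathsf{left}(r)]$ of slope smaller (respectively, greater) than $-\pi(\ell^\star)$ if $\ell^\star \in L_{Min}$ (respectively, $\ell^\star \in L_{Max}$), i.e.,

$$\begin{cases}
\mathsf{Val}_{\mathcal{G}_{L',r}}(\ell^\star, r') > \mathsf{Val}_{\mathcal{G}_{L',r}}(\ell^\star, \mathsf{left}(r)) + (\mathsf{left}(r) - r')\pi(\ell^\star) & \text{if } \ell^\star \in L_{Min} \\
\mathsf{Val}_{\mathcal{G}_{L',r}}(\ell^\star, r') < \mathsf{Val}_{\mathcal{G}_{L',r}}(\ell^\star, \mathsf{left}(r)) + (\mathsf{left}(r) - r')\pi(\ell^\star) & \text{if } \ell^\star \in L_{Max}.
\end{cases}$$

From Lemma 19, we also know that

$$\begin{cases}
\mathsf{Val}_{\mathcal{G}_{L',r}}(\ell^\star, r') \leq \mathsf{Val}_{\mathcal{G}_{L',r}}(\ell^\star, r) + (r - r')\pi(\ell^\star) & \text{if } \ell^\star \in L_{Min} \\
\mathsf{Val}_{\mathcal{G}_{L',r}}(\ell^\star, r') \geq \mathsf{Val}_{\mathcal{G}_{L',r}}(\ell^\star, r) + (r - r')\pi(\ell^\star) & \text{if } \ell^\star \in L_{Max}.
\end{cases}$$

Both equations combined imply

$$\begin{cases}
\mathsf{Val}_{\mathcal{G}_{L',r}}(\ell^\star, r) > \mathsf{Val}_{\mathcal{G}_{L',r}}(\ell^\star, \mathsf{left}(r)) + (\mathsf{left}(r) - r)\pi(\ell^\star) & \text{if } \ell^\star \in L_{Min} \\
\mathsf{Val}_{\mathcal{G}_{L',r}}(\ell^\star, r) < \mathsf{Val}_{\mathcal{G}_{L',r}}(\ell^\star, \mathsf{left}(r)) + (\mathsf{left}(r) - r)\pi(\ell^\star) & \text{if } \ell^\star \in L_{Max}
\end{cases}$$

which is not possible if $\mathsf{left}(r) = r$.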
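
The slope characterisation of Lemma 20 suggests a simple right-to-left scan over the affine pieces of a cost function. The Python sketch below illustrates this for a single location; the function name `left_valuation`, the encoding of the cost function by its cutpoints and values, and the sample data are assumptions of the sketch. For several locations of $L'$, one would take the maximum of the returned valuations.

```python
from fractions import Fraction

def left_valuation(cutpoints, values, price, owner):
    """Smallest cutpoint such that every affine piece to its right has
    slope >= -price (owner 'Min') or slope <= -price (owner 'Max').

    cutpoints: sorted abscissae ending at r; values[i] is the value there
    """
    left = cutpoints[-1]
    # scan the pieces from the rightmost one towards 0
    for i in range(len(cutpoints) - 1, 0, -1):
        slope = (values[i] - values[i - 1]) / (cutpoints[i] - cutpoints[i - 1])
        ok = slope >= -price if owner == "Min" else slope <= -price
        if not ok:
            break
        left = cutpoints[i - 1]
    return left

# Hypothetical cost function of a Min location with price 1 on [0, 1]:
# slope -6 on [0, 1/2] (violates the bound), slope -1/2 on [1/2, 1].
cps = [Fraction(0), Fraction(1, 2), Fraction(1)]
vals = [Fraction(3), Fraction(0), Fraction(-1, 4)]
left = left_valuation(cps, vals, price=1, owner="Min")
```

On arbitrary input the scan may return $r$ itself when the rightmost piece already violates the bound; for the value functions considered in the lemma, the statement $\mathsf{left}(r) < r$ rules this out.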


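F.3 Pieces of the value functions are segments of $F_\mathcal{G}$

► Lemma 21. Assume that $\mathcal{G}_{\ell^\star,r}$ is finitely optimal. If $\mathsf{Val}_{\mathcal{G}_{\ell^\star,r}}(\ell^\star)$ is affine on a non-singleton interval $I \subseteq [0,r]$ with a slope greater⁵ than $-\pi(\ell^\star)$, then there exists $f \in F_\mathcal{G}$ such that for all $\nu \in I$, $\mathsf{Val}_{\mathcal{G}_{\ell^\star,r}}(\ell^\star,\nu) = f(\nu)$.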

Proof. Let $\sigma_{Min}$ and $\sigma_{Max}$ be some fake-optimal NC-strategy and optimal FP-strategy in $\mathcal{G}_{\ell^\star,r}$. As $I$ is a non-singleton interval, there exists a subinterval $I' \subseteq I$, which is not a singleton and is contained in an interval of $\sigma_{Min}$ and of $\sigma_{Max}$.

Let $\nu \in I'$. As already noticed in the proof of Lemma 8, the play $\mathsf{Play}((\ell^\star,\nu), \sigma_{Min}, \sigma_{Max})$ necessarily reaches a final location and has cost $\mathsf{Val}_{\mathcal{G}_{\ell^\star,r}}(\ell^\star,\nu)$. Let $(\ell_0,\nu_0) \to \cdots \to (\ell_k,\nu_k)$ be its prefix until the first final location $\ell_k$ (the prefix used to compute the cost of the play). We also let $\nu' \in I'$ be a valuation such that $\nu < \nu'$.

Assume by contradiction that there exists an index $i$ such that $\nu < \nu_i$ and let $i$ be the smallest of such indices. For each $j < i$, if $\ell_j \in L_{Min}$, let $(t,\delta) = \sigma_{Min}(\ell_j,\nu)$ and $(t',\delta') = \sigma_{Min}(\ell_j,\nu')$. Similarly, if $\ell_j \in L_{Max}$, we let $(t,\delta) = \sigma_{Max}(\ell_j,\nu)$ and $(t',\delta') = \sigma_{Max}(\ell_j,\nu')$. As $I'$ is contained in an interval of $\sigma_{Min}$ and $\sigma_{Max}$, we have $\delta = \delta'$ and either $t = t' = 0$, or $\nu + t = \nu' + t'$. Applying this result for all $j < i$, we obtain that $(\ell_0,\nu') \to \cdots \to (\ell_{i-1},\nu') \to (\ell_i,\nu_i) \to \cdots \to (\ell_k,\nu_k)$ is a prefix of $\mathsf{Play}((\ell^\star,\nu'), \sigma_{Min}, \sigma_{Max})$: notice moreover that, as before, this prefix has cost $\mathsf{Val}_{\mathcal{G}_{\ell^\star,r}}(\ell^\star,\nu')$. In particular,

$$\mathsf{Val}_{\mathcal{G}_{\ell^\star,r}}(\ell^\star,\nu') = \mathsf{Val}_{\mathcal{G}_{\ell^\star,r}}(\ell^\star,\nu) - (\nu' - \nu)\pi(\ell_{i-1}) \leq \mathsf{Val}_{\mathcal{G}_{\ell^\star,r}}(\ell^\star,\nu) - (\nu' - \nu)\pi(\ell^\star)$$

which implies that the slope of $\mathsf{Val}_{\mathcal{G}_{\ell^\star,r}}(\ell^\star)$ is at most $-\pi(\ell^\star)$, and therefore contradicts the hypothesis. As a consequence, we have that $\nu_i = \nu$ for all $i$.

Again by contradiction, assume now that $\ell_k = \ell^f$ for some $\ell \in L \setminus L_u$. By the same reasoning as before, we then would have $\mathsf{Val}_{\mathcal{G}_{\ell^\star,r}}(\ell^\star,\nu') = \mathsf{Val}_{\mathcal{G}_{\ell^\star,r}}(\ell^\star,\nu) - (\nu' - \nu)\pi(\ell)$, which again contradicts the hypothesis.

Therefore, $\ell_k \in L_f$. If we let $w = \pi(\ell_0,\ell_1) + \cdots + \pi(\ell_{k-1},\ell_k)$, we have $\mathsf{Val}_{\mathcal{G}_{\ell^\star,r}}(\ell^\star,\nu) = w + \varphi_{\ell_k}(\nu)$. Since $\sigma_{Min}$ and $\sigma_{Max}$ are FP-strategies, that play constantly in valuation $\nu$, we know that $(\ell_0,\nu) \to \cdots \to (\ell_k,\nu)$ has no cycle, therefore $w \in [-(|L|-1)\Pi^{tr}, |L|\Pi^{tr}] \cap \mathbb{Z}$. Notice that the previous developments also show that for all $\nu' \in I'$ (here, $\nu < \nu'$ is not needed), $\mathsf{Val}_{\mathcal{G}_{\ell^\star,r}}(\ell^\star,\nu') = w + \varphi_{\ell_k}(\nu')$, with the same location $\ell_k$ and weight $w$. Since this equality holds on $I' \subseteq I$ which is not a singleton, and $\mathsf{Val}_{\mathcal{G}_{\ell^\star,r}}(\ell^\star)$ is affine on $I$, it holds everywhere on $I$. ◄

⁵ For this result, the order does not depend on the owner of the location, but rather depends on the fact that $\ell^\star$ has minimal price amongst locations of $\mathcal{G}$.

Figure 9 The case $\ell^\star \in L_{Max}$: a geometric proof of $f_i \neq f_j$. The dotted lines represent $f_i$ and $f_j$, the dashed lines have slope $-\pi(\ell^\star)$, and the plain line depicts $\mathsf{Val}_\mathcal{G}(\ell^\star, \cdot)$. Because the slope of $f_i$ is strictly smaller than $-\pi(\ell^\star)$, and the value at $r_j$ is above the dashed line, it cannot be the case that $f_i(r_j) = \mathsf{Val}_\mathcal{G}(\ell^\star, r_j) = f_j(r_j)$.

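F.4 Proof of Lemma 6

For the first item, we assume $\ell^\star \in L_{Min}$, since the proof of the other case only differs with respect to the sense of the inequalities. From Lemma 20, we know that in $\mathcal{G}_{\ell^\star,r_i}$ there exists $r' < r_{i+1}$ such that $\mathsf{Val}_{\mathcal{G}_{\ell^\star,r_i}}(\ell^\star)$ is affine on $[r', r_{i+1}]$ and its slope is smaller than $-\pi(\ell^\star)$, i.e., $\mathsf{Val}_{\mathcal{G}_{\ell^\star,r_i}}(\ell^\star, r_{i+1}) < \mathsf{Val}_{\mathcal{G}_{\ell^\star,r_i}}(\ell^\star, r') - (r_{i+1} - r')\pi(\ell^\star)$. Lemma 19 also ensures that $\mathsf{Val}_{\mathcal{G}_{\ell^\star,r_i}}(\ell^\star, r') \leq \mathsf{Val}_\mathcal{G}(\ell^\star, r_i) + (r_i - r')\pi(\ell^\star)$. Combining both inequalities allows us to conclude.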

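We now turn to the proof of the second item, showing the stationarity of sequence $(r_i)$. We consider first the case where $\ell^\star \in L_{Max}$. Let $i > 0$ such that $r_i \neq 0$ (if there exists no such $i$ then $r_1 = 0$). Recall from Lemma 20 that there exists $r'_i < r_i$ such that $\mathsf{Val}_{\mathcal{G}_{\ell^\star,r_{i-1}}}(\ell^\star)$ is affine on $[r'_i, r_i]$, of slope greater than $-\pi(\ell^\star)$. In particular,

$$\frac{\mathsf{Val}_{\mathcal{G}_{\ell^\star,r_{i-1}}}(\ell^\star, r_i) - \mathsf{Val}_{\mathcal{G}_{\ell^\star,r_{i-1}}}(\ell^\star, r'_i)}{r_i - r'_i} > -\pi(\ell^\star).$$

Lemma 21 states that on $[r'_i, r_i]$, $\mathsf{Val}_{\mathcal{G}_{\ell^\star,r_{i-1}}}(\ell^\star)$ is equal to some $f_i \in F_\mathcal{G}$. As $f_i$ is an affine function, $f_i(r_i) = \mathsf{Val}_{\mathcal{G}_{\ell^\star,r_{i-1}}}(\ell^\star, r_i)$, and $f_i(r'_i) = \mathsf{Val}_{\mathcal{G}_{\ell^\star,r_{i-1}}}(\ell^\star, r'_i)$, for all $\nu$,

$$f_i(\nu) = \mathsf{Val}_{\mathcal{G}_{\ell^\star,r_{i-1}}}(\ell^\star, r_i) + \frac{\mathsf{Val}_{\mathcal{G}_{\ell^\star,r_{i-1}}}(\ell^\star, r_i) - \mathsf{Val}_{\mathcal{G}_{\ell^\star,r_{i-1}}}(\ell^\star, r'_i)}{r_i - r'_i}\,(\nu - r_i).$$

Since $\mathcal{G}_{\ell^\star,r_{i-1}}$ is assumed to be finitely optimal, we know that $\mathsf{Val}_{\mathcal{G}_{\ell^\star,r_{i-1}}}(\ell^\star, r_i) = \mathsf{Val}_\mathcal{G}(\ell^\star, r_i)$, by definition of $r_i = \mathsf{left}_{\ell^\star}(r_{i-1})$. Therefore, for all valuations $\nu < r_i$, we have $f_i(\nu) < \mathsf{Val}_\mathcal{G}(\ell^\star, r_i) + \pi(\ell^\star)(r_i - \nu)$.

Consider then $j > i$ such that $r_j \neq 0$. We claim that $f_j \neq f_i$. Indeed, we have $\mathsf{Val}_\mathcal{G}(\ell^\star, r_j) = f_j(r_j)$. As, in $\mathcal{G}$, $\ell^\star$ is a non-urgent location, Lemma 17 ensures that (*): $\mathsf{Val}_\mathcal{G}(\ell^\star, r_j) \geq \mathsf{Val}_\mathcal{G}(\ell^\star, r_i) + \pi(\ell^\star)(r_i - r_j)$. As for all $i'$, $\mathsf{Val}_\mathcal{G}(\ell^\star, r_{i'}) = f_{i'}(r_{i'})$, (*) is equivalent to $f_j(r_j) \geq f_i(r_i) + \pi(\ell^\star)(r_i - r_j)$. Recall that $f_i$ has a slope strictly greater than $-\pi(\ell^\star)$, therefore $f_i(r_j) < f_i(r_i) + \pi(\ell^\star)(r_i - r_j) \leq f_j(r_j)$. As a consequence $f_i \neq f_j$ (this is depicted in Figure 9).

Therefore, there cannot be more than $|F_\mathcal{G}| + 1$ non-null elements in the sequence $r_0 \geq r_1 \geq \cdots$, which proves that there exists $i \leq |F_\mathcal{G}| + 2$ such that $r_i = 0$.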

We continue with the case where $\ell^\star \in L_{Min}$. Let $r_\infty = \inf\{r_i \mid i \geq 0\}$. In this case, we look at the affine parts of $\mathsf{Val}_\mathcal{G}(\ell^\star)$ with a slope greater than $-\pi(\ell^\star)$, and we show that there can only be finitely many such segments in $[r_\infty, 1]$. We then show that there is at least one such segment contained in $[r_{i+1}, r_i]$ for all $i$, bounding the size of the sequence.

Figure 10 In order for the segments $[a,b]$ and $[c,d]$ to be aligned, there must exist a segment with a bigger slope crossing $f_{[a,b]}$ (represented by a dashed line) between $b$ and $c$.


In the following, we call segment every interval $[a,b] \subseteq (r_\infty, 1]$ such that $a$ and $b$ are two consecutive cutpoints of the cost function $\mathsf{Val}_\mathcal{G}(\ell^\star)$ over $(r_\infty, 1]$. Recall that it means that $\mathsf{Val}_\mathcal{G}(\ell^\star)$ is affine on $[a,b]$, and if we let $a'$ be the greatest cutpoint smaller than $a$, and $b'$ the smallest cutpoint greater than $b$, the slopes of $\mathsf{Val}_\mathcal{G}(\ell^\star)$ on $[a',a]$ and $[b,b']$ are different from the slope on $[a,b]$. We abuse the notations by referring to the slope of a segment $[a,b]$ for the slope of $\mathsf{Val}_\mathcal{G}(\ell^\star)$ on $[a,b]$ and simply call cutpoint a cutpoint of $\mathsf{Val}_\mathcal{G}(\ell^\star)$.

To every segment $[a,b]$ with a slope greater than $-\pi(\ell^\star)$, we associate a function $f_{[a,b]} \in F_\mathcal{G}$ as follows. Let $i$ be the smallest index such that $[a,b] \cap [r_{i+1}, r_i]$ is a non-singleton interval $[a',b']$. Lemma 21 ensures that there exists $f_{[a,b]} \in F_\mathcal{G}$ such that for all $\nu \in [a',b']$, $\mathsf{Val}_\mathcal{G}(\ell^\star,\nu) = f_{[a,b]}(\nu)$.

Consider now two disjoint segments $[a,b]$ and $[c,d]$ with a slope strictly greater than $-\pi(\ell^\star)$, and assume that $f_{[a,b]} = f_{[c,d]}$ (in particular both segments have the same slope). Without loss of generality, assume that $b < c$. We claim that there exists a segment $[e,g]$ in-between $[a,b]$ and $[c,d]$ with a slope greater than the slope of $[c,d]$, and that $f_{[e,g]}$ and $f_{[a,b]}$ intersect over $[b,c]$, in a point of abscissa $x$, i.e., $x \in [b,c]$ verifies $f_{[e,g]}(x) = f_{[a,b]}(x)$ (depicted in Figure 10).

Let $\alpha$ be the greatest cutpoint smaller than $c$. We know that the slope of $[\alpha,c]$ is different from the one of $[c,d]$. If it is greater then define $e = \alpha$ and $x = g = c$, those indeed satisfy the property. If the slope of $[\alpha,c]$ is smaller than the one of $[c,d]$, then for all $\nu \in [\alpha,c)$, $\mathsf{Val}_\mathcal{G}(\ell^\star,\nu) > f_{[c,d]}(\nu)$. Let $x$ be the greatest point in $[b,\alpha]$ such that $\mathsf{Val}_\mathcal{G}(\ell^\star,x) = f_{[c,d]}(x)$. We know that it exists since $\mathsf{Val}_\mathcal{G}(\ell^\star,b) = f_{[c,d]}(b)$, and $\mathsf{Val}_\mathcal{G}(\ell^\star)$ is continuous. Observe that $\mathsf{Val}_\mathcal{G}(\ell^\star,\nu) > f_{[c,d]}(\nu)$, for all $x < \nu < c$. Finally, let $g$ be the smallest cutpoint of $\mathsf{Val}_\mathcal{G}(\ell^\star)$ strictly greater than $x$, and $e$ the greatest cutpoint of $\mathsf{Val}_\mathcal{G}(\ell^\star)$ smaller than or equal to $x$. By construction $[e,g]$ is a segment that contains $x$. The slope of the segment $[e,g]$ is $s_{[e,g]} = \frac{\mathsf{Val}_\mathcal{G}(\ell^\star,g) - \mathsf{Val}_\mathcal{G}(\ell^\star,x)}{g - x}$, and the slope of the segment $[c,d]$ is equal to $s_{[c,d]} = \frac{f_{[c,d]}(g) - f_{[c,d]}(x)}{g - x}$. Remembering that $\mathsf{Val}_\mathcal{G}(\ell^\star,x) = f_{[c,d]}(x)$, and that $\mathsf{Val}_\mathcal{G}(\ell^\star,g) > f_{[c,d]}(g)$ since $g \in (x,c)$, we obtain that $s_{[e,g]} > s_{[c,d]}$. Finally, since $\mathsf{Val}_\mathcal{G}(\ell^\star,x) = f_{[c,d]}(x) = f_{[e,g]}(x)$, it is indeed the abscissa of the intersection point of $f_{[c,d]} = f_{[a,b]}$ and $f_{[e,g]}$, which concludes the proof of the previous claim.

For every function $f \in F_\mathcal{G}$, there are less than $|F_\mathcal{G}|$ intersection points between $f$ and the other functions of $F_\mathcal{G}$ (at most one for each pair $(f,f')$). If $f$ has a slope greater than $-\pi(\ell^\star)$, thanks to the previous paragraph, we know that there are at most $|F_\mathcal{G}|$ segments $[a,b]$ such that $f_{[a,b]} = f$. Summing over all possible functions $f$, there are at most $|F_\mathcal{G}|^2$ segments with a slope greater than $-\pi(\ell^\star)$.

Now, we link those segments with the valuations $r_i$'s, for $i > 0$. By item (i), thanks to the finite-optimality of $\mathcal{G}_{\ell^\star,r_i}$, $\mathsf{Val}_\mathcal{G}(\ell^\star, r_{i+1}) < (r_i - r_{i+1})\pi(\ell^\star) + \mathsf{Val}_\mathcal{G}(\ell^\star, r_i)$. Furthermore, Lemma 8 states that the slope of the segment directly on the left of $r_i$ is equal to $-\pi(\ell^\star)$. With the previous inequality in mind, this cannot be the case if $\mathsf{Val}_\mathcal{G}(\ell^\star)$ is affine over the whole interval $[r_{i+1}, r_i]$. Thus, there exists a segment $[a,b]$ of slope strictly greater than $-\pi(\ell^\star)$ such that $b \in [r_{i+1}, r_i]$. As we also know that the slope left to $r_{i+1}$ is $-\pi(\ell^\star)$, it must be the case that $a \in [r_{i+1}, r_i]$. Hence, we have shown that in-between $r_{i+1}$ and $r_i$, there is always a segment (this is depicted in Figure 11). As the number of such segments is bounded by $|F_\mathcal{G}|^2$, we know that the sequence $r_i$ is stationary in at most $|F_\mathcal{G}|^2 + 1$ steps, i.e., that there exists $i \leq |F_\mathcal{G}|^2 + 1$ such that $r_i = 0$.

Figure 11 The case $\ell^\star \in L_{Min}$: as the value at $r_{i+1}$ is strictly below $\mathsf{Val}_\mathcal{G}(r_i) + \pi(\ell^\star)(r_i - r_{i+1})$, as the slope on the left of $r_i$ and of $r_{i+1}$ is $-\pi(\ell^\star)$, there must exist a segment (represented with a double line) with slope greater than $-\pi(\ell^\star)$ in $[r_{i+1}, r_i)$.


F.5 Proof of Lemma 8

We denote by $r'$ the smallest valuation (smaller than $r_1$) such that for all locations $\ell$, $\mathsf{Val}_\mathcal{G}(\ell)$ is affine over $[r', r_1]$. Then, the proof goes by contradiction: using Lemma 20, we assume that for all $\ell \in L' \cap L_{Min}$ (respectively, $\ell \in L' \cap L_{Max}$):

- either ($\neg(i)$) the slope of $\mathsf{Val}_\mathcal{G}(\ell)$ on $[r', r_1]$ is greater (respectively, smaller) than $-\pi(\ell)$,
- or ($(i) \wedge \neg(ii)$) for all $\nu \in [r', r_1]$, $\mathsf{Val}_\mathcal{G}(\ell,\nu) = \mathsf{Val}_\mathcal{G}(\ell, r_0) + \pi(\ell)(r_0 - \nu)$.

Let $\sigma^0_{Min}$ and $\sigma^0_{Max}$ (respectively, $\sigma^1_{Min}$ and $\sigma^1_{Max}$) be a fake-optimal NC-strategy and an optimal FP-strategy in $\mathcal{G}_{L',r_0}$ (respectively, $\mathcal{G}_{L',r_1}$). Let $r'' = \max\bigl((\mathsf{pts}(\sigma^1_{Min}) \cup \mathsf{pts}(\sigma^1_{Max})) \cap [r', r_1)\bigr)$, so that strategies $\sigma^1_{Min}$ and $\sigma^1_{Max}$ have the same behaviour on all valuations of the interval $(r'', r_1)$, i.e., either always play urgently the same transition, or wait, in a non-urgent location, until reaching some valuation greater than or equal to $r_1$ and then play the same transition.

Observe preliminarily that for all $\ell \in L' \cap L_{Min}$ (respectively, $\ell \in L' \cap L_{Max}$), if on the interval $(r'', r_1)$, $\sigma^1_{Min}$ (respectively, $\sigma^1_{Max}$) goes to $\ell^f$ then the slope on $[r'', r_1]$ (and thus on $[r', r_1]$) is $-\pi(\ell)$. Thus for such a location $\ell$, we know that $(i) \wedge \neg(ii)$ holds for $\ell$ (by letting $r'$ be $r''$).

For other locations $\ell$, we will construct a new pair of NC- and FP-strategies $\sigma_{Min}$ and $\sigma_{Max}$ in $\mathcal{G}_{L',r_0}$ such that for all locations $\ell$ and valuations $\nu \in (r'', r_1)$

$$\mathsf{fake}^{\sigma_{Min}}_{\mathcal{G}_{L',r_0}}(\ell,\nu) \leq \mathsf{Val}_\mathcal{G}(\ell,\nu) \leq \mathsf{Val}^{\sigma_{Max}}_{\mathcal{G}_{L',r_0}}(\ell,\nu). \qquad (7)$$

As a consequence, with Lemma 3 (over game $\mathcal{G}_{L',r_0}$), one would have that $\mathsf{Val}_{\mathcal{G}_{L',r_0}}(\ell,\nu) = \mathsf{Val}_\mathcal{G}(\ell,\nu)$, which will raise a contradiction with the definition of $r_1$ as $\mathsf{left}_{L'}(r_0) < r_0$ (by Lemma 20), and conclude the proof.

We only show the construction for $\sigma_{Min}$, as it is very similar for $\sigma_{Max}$. Strategy $\sigma_{Min}$ is obtained by combining strategies $\sigma^1_{Min}$ over $[0, r_1]$, and $\sigma^0_{Min}$ over $[r_1, r_0]$: special care has to be taken in case $\sigma^1_{Min}$ performs a jump to a location $\ell^f$, since then, in $\sigma_{Min}$, we rather glue this move with the decision of strategy $\sigma^0_{Min}$ in $(\ell, r_1)$. Formally, let $(\ell,\nu)$ be a configuration of $\mathcal{G}_{L',r_0}$ with $\ell \in L_{Min}$. We construct $\sigma_{Min}(\ell,\nu)$ as follows:

- if $\nu \geq r_1$, $\sigma_{Min}(\ell,\nu) = \sigma^0_{Min}(\ell,\nu)$;
- if $\nu < r_1$, $\ell \notin L'$ and $\sigma^1_{Min}(\ell,\nu) = (t, (\ell,\ell^f))$ for some delay $t$ (such that $\nu + t \leq r_1$), we let $\sigma_{Min}(\ell,\nu) = (r_1 - \nu + t', (\ell,\ell'))$ where $(t', (\ell,\ell')) = \sigma^0_{Min}(\ell, r_1)$;
- otherwise $\sigma_{Min}(\ell,\nu) = \sigma^1_{Min}(\ell,\nu)$.
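
The three cases of this construction translate directly into a small combinator. The Python sketch below is only an illustration: the function name `combine`, the encoding of a strategy as a function returning a `(delay, (source, target))` pair, and the encoding of the added waiting location as the tuple `("wait", location)` are all assumptions made for the example.

```python
from fractions import Fraction

def combine(sigma0, sigma1, r1, l_prime):
    """Glue sigma1 (used below r1) with sigma0 (used from r1 on).

    sigma0, sigma1: functions (location, valuation) -> (delay, (src, tgt))
    l_prime:        set of locations turned urgent
    """
    def sigma(loc, nu):
        if nu >= r1:
            return sigma0(loc, nu)
        delay, (src, tgt) = sigma1(loc, nu)
        if loc not in l_prime and tgt == ("wait", loc):
            # sigma1 jumps to the waiting location: wait until r1 instead,
            # then follow the decision of sigma0 taken in (loc, r1)
            delay0, edge0 = sigma0(loc, r1)
            return (r1 - nu + delay0, edge0)
        return (delay, (src, tgt))
    return sigma

# Hypothetical stub strategies for a single Min location 'a'.
r1 = Fraction(1, 2)
sigma1 = lambda loc, nu: (Fraction(0), (loc, ("wait", loc)))
sigma0 = lambda loc, nu: (Fraction(1, 4), (loc, "goal"))
glued = combine(sigma0, sigma1, r1, l_prime=set())
urgent = combine(sigma0, sigma1, r1, l_prime={"a"})
```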


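For all finite plays $\rho$ in $\mathcal{G}_{L',r_0}$ that conform to $\sigma_{Min}$, start in a configuration $(\ell,\nu)$ such that $\nu \in (r'', r_0]$ and $\ell \notin \{\ell'^f \mid \ell' \in L\}$, and end in a final location, we show by induction that $\mathsf{Cost}_{\mathcal{G}_{L',r_0}}(\rho) \leq \mathsf{Val}_\mathcal{G}(\ell,\nu)$. Note that $\rho$ either only contains valuations in $[r_1, r_0]$, or is of the form $(\ell,\nu) \xrightarrow{c} (\ell^f,\nu')$, or is of the form $(\ell,\nu) \xrightarrow{c} \rho'$ with $\rho'$ a run that satisfies the above restriction.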
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i If v £ [ri,ro], then p conforms with (T^ in , thus, as cr^ in is fake-optimal, Cost g L , ro (p) ^ Valg^, (£, v) = 
Valg(£, v) (the last inequality comes from the definition of r\ = left^/Ao)). Therefore, in the following cases, 
we assume that v £ (r",ri). 

h Consider then the case where p is of the form (£, v) -A (A, v'). 


_ if £ £ -L'niMin, £ is urgent in Gl' >ro , thus v' = v. Furthermore, since p conforms with erMim by construction 
of UMin, the choice of cr^ in on (r", r i) consists in going to A, thus, as observed above, (i ) A ->(**) holds for 
£. Therefore, Valg(f, v) = Val g(£,r 0 ) + 7r (£)(r 0 - v) = ipe f (v) = Costg^, ^ (p). 
m If £ £ ^Min \ L' , by construction, it must be the case that v) = (ri — v + t',(£, A)) where 

(t,(£,£f)) = &l/\ ln (£, v) and (t',(£, £*)) = a^ n {£,r{). Thus, v' = n + t'. In particular, observe that 

Cost g L , rQ (p) = (ri — v)t:{£) + Cost g L , (p?) where p' = (£, n) A (£f,i/). As p' conforms with ^Min 
which is fake-optimal in GL',r 0 , Cost g L , rQ {p') < Valg i7 (£, n) = Valg(£,ri) (since n = left(ro)). Thus 


Costg^, (p) < (?"i — i/)7r(^) + Val e (£, ri) = Cost g L , ri {p") where p" = (£,v) A- {£* ,v + t) conforms 
with CT^ in which is fake-optimal in Gl>, ri - Therefore, Costg^, r (p) ^ Valg i( ^ (£,v) = Ma\g(£,v) (since 
ri = left(ro)). 

If £ £ Am ax then Cost g L , ro (p) = W - y)ir{£) + pt f (y') = (v 1 - v)n(£) + (r 0 - v')ir(£) + Val g(£,r 0 ) = 
(ro — is)7r(£) + Valg(£,ro)- By Lemma 17 since £ £ I^Max \ A {£ is not urgent in Q since A exists), 
Valg(£, n) ^ (r 0 — ri)n(£) + Val g(£,r 0 )- Furthermore, observe that if we define p' as the play (£,v) —A 
(£*,v) in Gl', n ? then p' conforms with (T^ in and 


Cost g L , ri (p') = (ri - v)tt{£) + Valg(£, n) 

> (n - u)n(£) + (r 0 - n) n(£) + Valg (£, r 0 ) 

= (r 0 - v)n(£) + Valg (£, r 0 ) 

= Costg L , ro (p). 

Thus, as <Tj^ in is fake-optimal in Gu^n , Costg i( (p) < Costg^, ri (p') ^ Valg^, ri (f, i/) = Valg(£, v). 

m We finally consider the case where p = (£,v) A p' with p' that starts in configuration [£! ,v') such that 
£' ^ {t" s | £" £ L}. By induction hypothesis Costg t , rQ {p') ^ Valg(£', v'). 

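- If $\nu' \leq r_1$, let $\rho''$ be the play of $\mathcal{G}_{L',r_1}$ starting in $(\ell', \nu')$ that conforms with $\sigma^1_{\mathsf{Min}}$ and $\sigma^1_{\mathsf{Max}}$. If $\rho''$ does not reach a final location, since $\sigma^1_{\mathsf{Min}}$ is an NC-strategy, the costs of its prefixes tend to $-\infty$. By considering the strategy $\sigma'_{\mathsf{Min}}$ of Lemma 3, we would obtain a run conforming with $\sigma^1_{\mathsf{Max}}$ of cost smaller than $\mathsf{Val}_{\mathcal{G}_{L',r_1}}(\ell', \nu')$, which would contradict the optimality of $\sigma^1_{\mathsf{Max}}$. Hence, $\rho''$ reaches the target. Moreover, since $\sigma^1_{\mathsf{Max}}$ is optimal and $\sigma^1_{\mathsf{Min}}$ is fake-optimal, we finally know that $\mathsf{Cost}_{\mathcal{G}_{L',r_1}}(\rho'') = \mathsf{Val}_{\mathcal{G}_{L',r_1}}(\ell', \nu') = \mathsf{Val}_{\mathcal{G}}(\ell', \nu')$ (since $\nu' \in [\mathsf{left}(r_1), r_1]$). Therefore,

\[
\begin{aligned}
\mathsf{Cost}_{\mathcal{G}_{L',r_0}}(\rho) &= (\nu' - \nu)\pi(\ell) + \pi(\ell, \ell') + \mathsf{Cost}_{\mathcal{G}_{L',r_0}}(\rho') \\
&\leq (\nu' - \nu)\pi(\ell) + \pi(\ell, \ell') + \mathsf{Val}_{\mathcal{G}}(\ell', \nu') \\
&= (\nu' - \nu)\pi(\ell) + \pi(\ell, \ell') + \mathsf{Cost}(\rho'') = \mathsf{Cost}((\ell, \nu) \to \rho'').
\end{aligned}
\]

Since the play $(\ell, \nu) \to \rho''$ conforms with $\sigma^1_{\mathsf{Min}}$, we finally have $\mathsf{Cost}_{\mathcal{G}_{L',r_0}}(\rho) \leq \mathsf{Cost}((\ell, \nu) \to \rho'') \leq \mathsf{Val}_{\mathcal{G}_{L',r_1}}(\ell, \nu) = \mathsf{Val}_{\mathcal{G}}(\ell, \nu)$.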

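- If $\nu' > r_1$ and $\ell \in L_{\mathsf{Max}}$, let $\rho^1$ be the play in $\mathcal{G}_{L',r_1}$ defined by $\rho^1 = (\ell, \nu) \to (\ell^f, \nu)$ and $\rho^0$ the play in $\mathcal{G}_{L',r_0}$ defined by $\rho^0 = (\ell, r_1) \to \rho'$. We have

\[
\begin{aligned}
\mathsf{Cost}_{\mathcal{G}_{L',r_0}}(\rho) &= (\nu' - \nu)\pi(\ell) + \pi(\ell, \ell') + \mathsf{Cost}_{\mathcal{G}_{L',r_0}}(\rho') \\
&= \underbrace{\varphi_{\ell^f}(\nu)}_{= \mathsf{Cost}_{\mathcal{G}_{L',r_1}}(\rho^1)} - \mathsf{Val}_{\mathcal{G}}(\ell, r_1) + \underbrace{(\nu' - r_1)\pi(\ell) + \pi(\ell, \ell') + \mathsf{Cost}_{\mathcal{G}_{L',r_0}}(\rho')}_{= \mathsf{Cost}_{\mathcal{G}_{L',r_0}}(\rho^0)}.
\end{aligned}
\]

Since $\rho^0$ conforms with $\sigma^0_{\mathsf{Min}}$, fake-optimal, and reaches a final location, $\mathsf{Cost}_{\mathcal{G}_{L',r_0}}(\rho^0) \leq \mathsf{Val}_{\mathcal{G}_{L',r_0}}(\ell, r_1) = \mathsf{Val}_{\mathcal{G}}(\ell, r_1)$ (since $r_1 = \mathsf{left}_{L'}(r_0)$). We also have that $\rho^1$ conforms with $\sigma^1_{\mathsf{Min}}$, so the previous explanations already proved that $\mathsf{Cost}_{\mathcal{G}_{L',r_1}}(\rho^1) \leq \mathsf{Val}_{\mathcal{G}}(\ell, \nu)$. As a consequence, $\mathsf{Cost}_{\mathcal{G}_{L',r_0}}(\rho) \leq \mathsf{Val}_{\mathcal{G}}(\ell, \nu)$.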




Figure 12 Value functions of the SPTG of Figure 1. [Plots not reproduced; the panel for $\mathsf{Val}(\ell_2, x)$ has axis ticks at 0, 1/4, 1/2, 3/4, 9/10 and 1.]


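- If $\nu' > r_1$ and $\ell \in L_{\mathsf{Min}}$, we know that $\ell$ is non-urgent, so that $\ell \notin L'$. Therefore, by definition of $\sigma_{\mathsf{Min}}$, $\sigma_{\mathsf{Min}}(\ell, \nu) = (r_1 - \nu + t', (\ell, \ell'))$ where $\sigma^1_{\mathsf{Min}}(\ell, \nu) = (t, (\ell, \ell^f))$ for some delay $t$, and $\sigma^0_{\mathsf{Min}}(\ell, r_1) = (t', (\ell, \ell'))$.

If we let $\rho^1$ be the play in $\mathcal{G}_{L',r_1}$ defined by $\rho^1 = (\ell, \nu) \to (\ell^f, \nu)$ and $\rho^0$ the play in $\mathcal{G}_{L',r_0}$ defined by $\rho^0 = (\ell, r_1) \to \rho'$, as in the previous case, we obtain that $\mathsf{Cost}_{\mathcal{G}_{L',r_0}}(\rho) \leq \mathsf{Val}_{\mathcal{G}}(\ell, \nu)$.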

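As a consequence of this induction, we have shown that for all $\ell \in L$, and for all $\nu \in (r'', r_1)$, $\mathsf{fake}^{\sigma_{\mathsf{Min}}}_{\mathcal{G}_{L',r_0}}(\ell, \nu) \leq \mathsf{Val}_{\mathcal{G}}(\ell, \nu)$. This establishes one of the two claims, the other being obtained very similarly.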


G Run of the algorithm on an example

Figure 12 shows the value functions of the SPTG of Figure 1. Here is how the algorithm obtains those functions. First it computes the functions at valuation 1, thanks to solveInstant. Then, it computes the value of the game where all states are urgent but additional terminal states have been added by the wait function to allow waiting until 1. This step gives the correct value functions until the cutpoint found in the repeat loop: at first a = 9/10, but the slope in $\ell_1$ is smaller than the slope that would be granted by waiting. Then a = 3/4, where the algorithm gives a slope of value −16 in $\ell_2$ while the cost of this location of Max is −14. We thus choose r := 3/4 and run the algorithm on the interval [0, r] with final states allowing one to wait until r and get the already known value in r. The algorithm then stops at 1/2 in order to allow $\ell_1$ to wait, then at 1/4 because of $\ell_2$, and finally the algorithm reaches 0, giving us the value functions on the entire interval [0, 1].


H Reset-acyclic PTGs

Towards solving reset-acyclic PTGs, our first step is to remove strict guards from the transitions, i.e., guards of the form $(a, b]$, $[a, b)$ or $(a, b)$ with $a, b \in \mathbb{N}$. For this, we enhance the PTG with regions in a method similar to what is done in [13, Lemma 4.6]. Formally, let $\mathcal{G} = (L_{\mathsf{Min}}, L_{\mathsf{Max}}, L_f, L_u, \varphi, \Delta, \pi)$ be a PTG. We define the region-PTG of $\mathcal{G}$ as $\mathcal{G}' = (L'_{\mathsf{Min}}, L'_{\mathsf{Max}}, L'_f, L'_u, \varphi', \Delta', \pi')$ where:

- $L'_{\mathsf{Min}} = \{(\ell, I) \mid \ell \in L_{\mathsf{Min}}, I \in \mathsf{Reg}_{\mathcal{G}}\}$;

- $L'_{\mathsf{Max}} = \{(\ell, I) \mid \ell \in L_{\mathsf{Max}}, I \in \mathsf{Reg}_{\mathcal{G}}\}$;

- $L'_f = \{(\ell, I) \mid \ell \in L_f, I \in \mathsf{Reg}_{\mathcal{G}}\}$;

- $L'_u = \{(\ell, I) \mid \ell \in L_u, I \in \mathsf{Reg}_{\mathcal{G}}\}$;

- $\forall (\ell, I) \in L'_f$, $\varphi'_{(\ell, I)} = \varphi_\ell$;

- the transition relation is

\[
\begin{aligned}
\Delta' ={} & \Big\{ \big((\ell, I), \overline{I_g \cap I}, R, (\ell', I')\big) \;\Big|\; (\ell, I_g, R, \ell') \in \Delta,\ I' = \begin{cases} I & \text{if } R = \bot \\ \{0\} & \text{otherwise} \end{cases} \Big\} \\
& \cup \big\{ \big((\ell, (M_k, M_{k+1})), \{M_{k+1}\}, \bot, (\ell, \{M_{k+1}\})\big) \mid \ell \in L,\ (M_k, M_{k+1}) \in \mathsf{Reg}_{\mathcal{G}} \big\} \\
& \cup \big\{ \big((\ell, \{M_k\}), \{M_k\}, \bot, (\ell, (M_k, M_{k+1}))\big) \mid \ell \in L,\ (M_k, M_{k+1}) \in \mathsf{Reg}_{\mathcal{G}} \big\};
\end{aligned}
\]

- $\forall (\ell, I) \in L'$, $\pi'(\ell, I) = \pi(\ell)$; and $\forall ((\ell, I), I_g, R, (\ell', I')) \in \Delta'$, if $(\ell, I_g, R, \ell') \in \Delta$, then $\pi'((\ell, I), I_g, R, (\ell', I')) = \pi(\ell, I_g, R, \ell')$, else $\pi'((\ell, I), I_g, R, (\ell', I')) = 0$.
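The construction of the transitions of the region-PTG can be illustrated by a minimal Python sketch. It is not taken from the paper and rests on several assumptions made only for illustration: a region is encoded as a pair `(lo, hi)`, with `lo == hi` for a point region and `lo < hi` for an open one; a guard is a tuple `(lo, lo_closed, hi, hi_closed)`; regions are bounded by the largest constant; transitions whose guard does not meet the region are simply dropped; guards of the result are closed intervals. All function names are hypothetical.

```python
def regions(constants):
    """Regions induced by sorted constants M_0 < ... < M_K (assumed bounded by M_K)."""
    regs = []
    for k, m in enumerate(constants):
        regs.append((m, m))                      # point region {M_k}
        if k + 1 < len(constants):
            regs.append((m, constants[k + 1]))   # open region (M_k, M_{k+1})
    return regs


def closed_meet(guard, region):
    """Closure of (guard ∩ region) as a closed interval (lo, hi), or None if empty."""
    g_lo, g_lo_closed, g_hi, g_hi_closed = guard
    r_lo, r_hi = region
    if r_lo == r_hi:  # point region: test membership of the point in the guard
        above = g_lo < r_lo or (g_lo == r_lo and g_lo_closed)
        below = r_lo < g_hi or (r_lo == g_hi and g_hi_closed)
        return (r_lo, r_lo) if above and below else None
    lo, hi = max(g_lo, r_lo), min(g_hi, r_hi)
    if lo < hi:
        return (lo, hi)
    # a punctual guard lying strictly inside the open region
    if lo == hi and r_lo < lo < r_hi and g_lo_closed and g_hi_closed:
        return (lo, lo)
    return None


def region_ptg_transitions(locations, transitions, constants):
    """transitions: list of (src, guard, reset, dst, weight); reset is a bool."""
    regs = regions(constants)
    out = []
    for src, guard, reset, dst, weight in transitions:
        for reg in regs:
            g = closed_meet(guard, reg)
            if g is None:
                continue
            target = (0, 0) if reset else reg    # I' = {0} after a reset, else I
            out.append(((src, reg), g, reset, (dst, target), weight))
    for loc in locations:                        # region changes, with weight 0
        for lo, hi in regs:
            if lo < hi:
                out.append(((loc, (lo, hi)), (hi, hi), False, (loc, (hi, hi)), 0))
                out.append(((loc, (lo, lo)), (lo, lo), False, (loc, (lo, hi)), 0))
    return out


example = region_ptg_transitions(
    ["a", "b"],
    [("a", (0, False, 1, True), False, "b", 5)],  # guard (0, 1], no reset
    [0, 1],
)
```

In this example the strict guard (0, 1] yields a transition with closed guard [0, 1] from the location paired with the open region (0, 1), and none from the location paired with the point region {0}.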


It is easy to verify that, in all configurations $((\ell, \{M_k\}), \nu)$ reachable from the null valuation, the valuation $\nu$ is $M_k$. More interestingly, in all configurations $((\ell, (M_k, M_{k+1})), \nu)$ reachable from the null valuation, the valuation $\nu$ is in $[M_k, M_{k+1}]$: indeed, if $\nu = M_k$ (respectively, $M_{k+1}$), it intuitively simulates a configuration of the original game with a valuation arbitrarily close to $M_k$, but greater than $M_k$ (respectively, arbitrarily close to $M_{k+1}$, but smaller than $M_{k+1}$). The game can thus take transitions with guard $x > M_k$, but cannot take transitions with guard $x = M_k$ anymore.


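► Lemma 22. Let $\mathcal{G}$ be a one-clock PTG, and $\mathcal{G}'$ be its region-PTG defined as before. For $(\ell, I) \in L \times \mathsf{Reg}_{\mathcal{G}}$ and $\nu \in I$, $\mathsf{Val}_{\mathcal{G}}(\ell, \nu) = \mathsf{Val}_{\mathcal{G}'}((\ell, I), \nu)$. Moreover, we can transform an $\varepsilon$-optimal strategy of $\mathcal{G}'$ into an $\varepsilon'$-optimal strategy of $\mathcal{G}$ with $\varepsilon' > \varepsilon$.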


Proof. The proof consists in replacing strategies of $\mathcal{G}'$, where players can play on the borders of regions, by strategies of $\mathcal{G}$ that play increasingly close to the border as time passes. If played close enough, the loss created can be chosen as small as we want. ◄


Consider now the region-PTG $\mathcal{G}$ associated to a reset-acyclic PTG (and of polynomial size with respect to the original PTG). We can decompose the graph of $\mathcal{G}$ into strongly connected components (that do not contain reset transitions by hypothesis). Consider first its bottom strongly connected components, i.e., components with no reset transitions exiting from them. All clock constraints are of the form $[a, b]$ with $a < b$, or $\{a\}$. We denote by $0 = M_0 < M_1 < \cdots < M_k$ the constants appearing in the guards of the component (adding 0). Then, solving the component amounts to (i) solving the sub-game with only transitions with guard $\{M_k\}$, then replacing these transitions by final locations with the cost just computed, (ii) solving the modified sub-game with only transitions with guard $[M_{k-1}, M_k]$, by first shrinking the guards to transform the game into an SPTG, and so on, until $M_0 = 0$. Once all bottom strongly connected components are solved, we replace the reset transitions going to them by final locations again, using the cost computed so far. We continue until no strongly connected components remain. Each SPTG being solvable in exponential time, the overall reset-acyclic PTG can be solved in exponential time too.
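The bottom-up traversal of the strongly connected components described above can be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: the graph of the region-PTG is assumed to be given as a dictionary mapping each location to its successors, the function name `sccs_bottom_first` is hypothetical, and the actual solving of each component (by the SPTG algorithm) is left out. The sketch uses Tarjan's algorithm, which happens to emit components in reverse topological order, so a component is listed only after every component it can reach.

```python
def sccs_bottom_first(graph):
    """Strongly connected components of `graph`, bottom components first.

    `graph` maps each node to an iterable of successor nodes. Tarjan's
    algorithm closes a component only after all components reachable
    from it have been closed, which gives the bottom-up order.
    """
    index, low = {}, {}
    stack, on_stack = [], set()
    order = []
    counter = 0

    def visit(v):
        nonlocal counter
        index[v] = low[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:       # v is the root of a component
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            order.append(component)

    for v in graph:
        if v not in index:
            visit(v)
    return order


# Hypothetical example: {a, b} can reach {c, d}, but not the converse.
example_graph = {"a": ["b"], "b": ["a", "c"], "c": ["d"], "d": ["c"]}
solving_order = sccs_bottom_first(example_graph)
```

Each component of `solving_order` would then be solved in turn, its incoming transitions being replaced by final locations carrying the values just computed. The recursive formulation keeps the sketch short; a very deep graph would call for an iterative variant.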



